
Monday, November 2, 2020

A Process to minimize the gap between research and its applications

Human history is built on countless pieces of evidence, doctrines, and theologies, developed through logic and through extensive research in every field that exists in the universe.

There is a huge gap between research and its practical application. Studies are conducted with enormous responsibility and extreme effort, yet they often have a very minimal effect on real-world problems. The majority of research across the globe appears only in a few journals, books, or digital or printed publications, is forgotten by the real world almost immediately, and, in the name of a new paradigm of investigative research, the wheel is re-invented. A huge share of this knowledge sits unused, with no intention of it ever being applied, because it is not understood or simply not available to later generations. This wastes the valuable resources of individuals, organizations, and educational institutions, and turns their research and findings into an enormous burden.

Previous research should be made available to researchers across the globe so that they can choose unique research topics and obtain valuable research findings. Research should not be conducted on the basis of individual preferences or personal benefit; instead, the problems facing humanity should be treated as the highest priority and the main motivation for conducting studies.

The most resilient and welcoming choice for reducing this gap is as follows:

  • Global research institutions and corporate organizations should treat it as the need of the hour to come together, cooperate, and coordinate in conducting research
  • They should share common objectives, preferences, and benefits, and encourage researchers to work on real-world issues
  • They should also fund researchers to speed up understanding and resolve the existing problems that the human race faces in every aspect of life, encouraging humans to solve issues collectively
  • In doing so, they can help establish global peace and transform the world into a universally safe place to live.

In this contemporary world, the available technology, architecture, and computing power, combined with artificial intelligence (AI) algorithms, can effortlessly summarize existing research findings. They can also help to find research gaps and future research needs. Such algorithmic findings can guide upcoming researchers in framing their research objectives and motivate them to investigate and find solutions to problems in a better, more optimal way that can be scaled up and implemented across the globe.

Companies like Google and IBM (with Watson) have already established NLP tools that can summarise text to a certain extent. Such companies can cooperate and coordinate with research institutions and provide unique methodologies, which can become another level of business model that benefits both societies and the companies themselves.
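As a rough illustration of what such summarisation tooling looks like in practice, below is a minimal sketch using the open-source Hugging Face transformers library; the model choice and the placeholder input text are assumptions for demonstration, not a reference to any specific vendor's product.

from transformers import pipeline

# Load a general-purpose abstractive summarisation model (downloaded on first use).
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

abstract = (
    "Paste the abstract or full text of a research paper here. "
    "The model condenses it into a short summary that can be indexed, "
    "compared with other papers, and scanned for gaps by later researchers."
)

summary = summarizer(abstract, max_length=60, min_length=15, do_sample=False)
print(summary[0]["summary_text"])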

It is the responsibility of researchers worldwide to avoid producing or spreading misleading research findings that confuse communities across the globe; they should also stop publishing biased results. Much research is conducted merely to show presence and has no significant impact on real-world issues. This habit should stop at the earliest, or those capabilities and funds should be redirected to finding real solutions. It is also a huge responsibility to accept bitter-truth research findings and to adopt them.

Finally, exchanging ideas and research findings will always help to build better societies for the human race around the globe.

I would be happy to receive your feedback or thoughts on how to enhance this idea and implement it. Please contact me at varadivk@gmail.com or +91-7829033033 to discuss further.



from Featured Blog Posts - Data Science Central https://ift.tt/3mM5v47
via Gabe's Musings

Digital Transformation in the age of COVID-19

COVID-19, by confining us to our homes and bringing the global economy to a standstill, has proved that the world's healthcare ecosystem was not prepared for a pandemic. However, what it did teach us is that, as consumers, we are prepared to adopt the digital era in all facets of life.

This newfound pro-digital movement, however, has also exposed a glaring picture: despite the preparatory stages that businesses and governments entered into years ago, there still exists a sky-to-ground disequilibrium between the demand for going digital and the supply of digital solutions and support.

To close the gap, businesses which had mapped their digital strategies in two-to-three-year phases have brought the initiative activation milestone down to as little as a couple of days or weeks.

In a European survey, nearly 70 percent of executives from Germany, Switzerland, and Austria said that the pandemic is likely to speed up their digital transformation efforts. This acceleration is visible across a range of geographies and sectors.

Consider how:

  • Banks have moved their physical banking channels online. 
  • Healthcare service providers have moved into telehealth, while insurers have started focusing on self-service assessment of claims. 
  • Retailers have started focusing on contactless purchasing and delivery. 

One emerging trend is that digital foundations are helping leading companies adapt and grow quickly in the crisis. Brands like Walmart, Amazon, Citrix, and Netflix, which are known to be digital leaders, are performing a lot better in this crisis, and we are constantly seeing them double down on their investments to widen the gap. The laggards, on the other hand, still have scope to catch up if they jump-start their digital journey at a greater speed.

When we work with clients on their digital transformation journey, most of the CEO- and CMO-level executives tell us that while they understand the end goals and the fact that digital has become the only survival mode, they are unsure of how to get there with confidence and speed.

What we suggest to them is to look at where they currently stand and to adjust their expectations. The reality for companies that have never been digital, or have kept adoption as their third or fourth priority, is that they cannot just wake up one day, reduce costs, and change how their employees view digital transformation and customer expectations, all while pivoting toward newer growth opportunities.

Once they realize where they presently stand and how far the bar has been raised, we start at a point where they understand the need better. This is where we help them bounce ahead of the competition by helping them rethink transformation through these practices:

Cloud Migration

Digital leaders are able to develop a digital base and scale it across the business when they have a strong foundation in the cloud: a foundation built on efficiency, innovation, and talent-based advantages for delivering outcomes fast and differently.

We ask them to make cloud migration and cloud expertise a leadership-level agenda item. We also ask them to set a target of shifting a minimum of 60% of their business to the cloud in the next quarter.

Go Back to the New Basics

A good number of the clients that come to us seeking help with their digital transformation needs are at the top of their industries. However, to achieve true digital transformation, businesses have to relook at and redefine their traditional ways of running a process and delivering value. We ask them to learn from their employees, clients, and the emerging digital-only competitors to understand how those competitors are approaching the same service delivery.

According to a McKinsey report, bold moves to adopt digital technologies early and at scale, combined with a greater allocation of digital-focused resources, align with higher value creation.

Retain forced “agility”

The moment the coronavirus was found to be transmitted through shared physical spaces and interactions, the foundation of working from the office fell down like a house of cards. Almost overnight, businesses became digital enough to enable organization-wide remote working. What was then forced has now become a trend, with employees and executives alike saying that they don't want to go back.

To retain this speed in digital adoption, look back at the challenges you faced, internally and externally, which prevented you from achieving your goals. Then develop lean processes to aid decision making, streamline the procurement process, and evolve the culture to support new ways of working.

In situations of extreme uncertainty, like the one we are living through today, leadership teams have to learn quickly what is working for them and what is not. This calls for identifying and learning about unknown elements as and when they appear.

Refocus on Technologies 

The sudden shift to virtual interactions and operations, both outside and inside the organization, offers an opportunity to speed up learning about and adopting new technologies that your business may have only started experimenting with.

Up until this point, as CEOs and CMOs, you must have realized the pain points in the present technology stack. You must also have gotten a preview of the impact that the technology stack would have if carried forward.

In the effort of adopting new technologies, we help our clients look at the process on these grounds:

  • Scalability - ever since you shifted to virtual delivery of your services, have you been able to maintain customer inflow and acquisition, or has the number dropped? If it's the same, is your digital architecture prepared to increase the count? For improving scalability, we generally advise our clients to work on their backend by moving to a microservice architecture and to accelerate their cloud migration efforts. 
  • Data security - were there any data breaches when you shifted to remote working and the subsequent data sharing practices? If yes, you can look into technologies like AI, blockchain, and IoT to make data movement more secure in the future. 
  • Usability - before the crisis, consumers and business partners had little choice in how they accessed your services or products through the new digital offerings. The options, however, have expanded now that we are coming out of the eye of the crisis. 

Relook at your offerings to check whether they stack up to your internal and external stakeholders' expectations. If usability is low, any digital consultancy firm would advise you to improve it by learning from your customers and employees.

Expedite the Outcome 

The only way businesses can take advantage of this digital-friendly moment is by developing a culture of constant innovation. They should begin by creating a task force to look at contributing external factors, review internal processes and limitations, identify new opportunities, and, all the while, keep an eye on the competition.

It has become all the more crucial for them to partner with an external digital transformation consultancy that can help their teams and processes understand what is lacking and look at new opportunities from a digital perspective. These opportunities should then be combined with a fail-fast culture that helps in understanding rapidly changing customer needs, creating prototypes to gain customer feedback, and seeing how digital tools help turn those customers into loyal consumers.

Accept that Perfection is Good’s Enemy

In our experience, there's nothing that pulls back an organization's digital adoption efforts more than aiming to be perfect - the perfect digital message, perfect tools, perfect optimization, etc. The sudden nature of COVID-19 has forced a shift from "after months of hypothesis and A/B testing, we have assembled the best digital plan" to "it works".

The present time has given the Agile Manifesto concept of "working software" the rightful place under the sun that it deserves.

Parting Statement

It is more or less a rule of human nature that the best learning happens in some of the most uncertain and devastating times. The present coronavirus crisis ticks all the boxes of an uncertain and devastating event.

We know that the companies, and even industries, which learn quickly and are open to adapting with the help of the acceleration that digital offerings provide will be able to rise above day-to-day digital demands. The unique insights that they draw from their employees and customers at this point will help ensure that their digital future is far more robust coming out of the COVID-19 crisis than it was going in.



from Featured Blog Posts - Data Science Central https://ift.tt/2Jt0lvv
via Gabe's Musings

4 Powerful Use Cases for Data Science in Finance

Data-driven solutions play a fundamental role in enhancing the services and profit margins of modern businesses in the finance sector. JP Morgan, one of the USA’s largest banking institutions, invests $11.5 billion a year in new technologies for this purpose. The company’s machine learning-based COiN platform reviews 12,000 annual commercial loan agreements in just a few hours, as opposed to the 360,000 man-hours it would take to do so manually. The benefits of applying data science in finance are diverse. And this is just one example.
There are a plethora of success stories demonstrating how major financial players capitalise on their data. But what's the significance of data science in finance now, in particular?
The coronavirus pandemic and the global measures that have followed have created a perfect economic storm. The financial sector stands at the front line of a growing credit crisis, with banks trying to manage disruption and maintain strict compliance amid social distancing guidelines which are at odds with their processes. Then there are the extraordinarily low interest rates and increasingly cash-insecure consumers to contend with. Some of the biggest banking challenges posed by the pandemic are:
  • Prioritising resources to cover the most critical business processes. Like many industries, the banking sector has found itself scrambling for answers and slow to make decisions on resourcing capacity, because it lacked an adequate data repository.
  • Delivering financial services off-site. Some essential financial operations, like branch banking, treasury or settlements, can only be done on-site. And the lack of a comprehensive customer database has prevented banks from being able to promptly accept and process payments from different accounts.
  • Dealing with the rising number of fraud cases. There have been numerous cases of critical data theft since COVID-19 first appeared. With rigorous data analysis, suspicious transactions could have been identified sooner and monetary fraud prevented.

To navigate the immediate obstacles, financial institutions must assess short-to-medium-term financial risks and adapt to new ways of operating in a post-pandemic world. Data science can be a powerful tool in finance, aiding risk management and continuity planning so that the industry is better prepared when the next challenge arises.

4 ways to harness data science within finance

A recent report from the World Economic Forum predicts that 463 exabytes of data will be generated daily by 2025. That‘s equal to 212 million DVDs a day, with an almost incomprehensible amount of actionable insights. Here are four key examples of how insurance, banking and investment companies can use data science to innovate the financial field.

1. Detect and prevent fraud

According to the American Bankers Association, banking institutions prevented $22 billion worth of fraudulent transactions in 2018. Now, using solutions powered by machine learning technologies, the finance industry is aiming at real-time fraud detection to minimise losses.

Machine learning enables the creation of algorithms that can learn from data, spot any unusual user behaviour, predict risks, and automatically notify financial companies of a threat. Data science helps banks recognise:

  • Fake insurance claims. With the help of machine learning algorithms, data provided by insurance agents, police, or clients can be analysed to spot inconsistencies more accurately than with manual checks.
  • Duplicate transactions and insurance claims. Duplicated invoices or claims aren’t always sinister, but machine learning algorithms can distinguish between an accidental click and a premeditated fraud attempt, thus preventing financial losses.
  • Account theft and suspicious transactions. Algorithms can analyse a user's routine transactional data; any suspicious activity can then be flagged and verified by the card owner (a minimal sketch follows this list).
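As a minimal sketch of that last point - flagging unusual transactions with an unsupervised anomaly detector (scikit-learn's IsolationForest) - consider the snippet below; the file name, feature columns, and contamination rate are illustrative assumptions, not a production fraud model.

import pandas as pd
from sklearn.ensemble import IsolationForest

# Hypothetical transaction log with simple numeric features.
transactions = pd.read_csv("transactions.csv")
features = transactions[["amount", "hour_of_day", "days_since_last_txn"]]

# Assume roughly 1% of transactions are anomalous.
detector = IsolationForest(contamination=0.01, random_state=42)
transactions["suspicious"] = detector.fit_predict(features) == -1  # -1 marks outliers

# Flagged rows would then be routed to the card owner for verification.
print(transactions[transactions["suspicious"]].head())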

2. Manage customer data more efficiently

Financial institutions are responsible for managing vast amounts of customer data – transactions, mobile interactions and social media activity. This information can be categorised as “structured” or “unstructured” – the latter posing a real challenge when it comes to processing.

Employing data science within finance helps companies manage and store customers’ data far more efficiently. Firms can boost profits using AI-driven tools and technologies such as natural language processing (NLP), data mining and text analytics, while machine learning algorithms analyse data, identify valuable insights and suggest better business solutions.

3. Enable data-driven risk assessment

The financial industry faces potential risks from competitors, credits, volatile markets and more. Data science can help finance firms analyse their data to proactively identify such risks, monitor them, then prioritise and address them if investments become vulnerable.

Financial traders, managers, and investors can make reliable predictions around trading, based on past and present data. Data science can analyse the market landscape and customer data in real time, enabling financial specialists to take action to mitigate risks.

Data science can also be used in finance to implement a credit scoring algorithm. Using the wealth of available customer data, it can analyse transactions and verify creditworthiness far more efficiently.
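A minimal sketch of such a scoring model is shown below, using logistic regression on historical loan outcomes; the file and feature names are assumptions for illustration only.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

loans = pd.read_csv("loan_history.csv")                  # hypothetical historical loans
X = loans[["income", "debt_to_income", "late_payments"]]
y = loans["defaulted"]                                   # 1 = borrower defaulted

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

default_probability = model.predict_proba(X_test)[:, 1]  # probability of default per applicant
print("AUC:", roc_auc_score(y_test, default_probability))

In practice, the predicted probability would be mapped onto a score band and combined with business rules before any lending decision is made.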

4. Leverage customer analytics and personalisation

Data science is a powerful tool for helping financial institutions understand customers. Machine learning algorithms are able to gather insights on clients’ preferences, to improve personalisation and build predictive models of behaviour. Meanwhile, NLP and voice recognition software can improve communication with consumers. Thus, financial institutions can optimise business decisions and offer enhanced customer service.

Studying behavioural trends allows financial institutions to predict each consumer's actions. Insurance companies use consumer analysis to minimise losses by identifying "below-zero" customers - those who cost more to serve than they bring in - and by measuring customer "lifetime value".
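As a toy illustration of the "lifetime value" idea, the formula and numbers below are illustrative assumptions rather than figures from any insurer.

# Simple customer lifetime value estimate (all values are assumed).
avg_purchase_value = 120.0   # average revenue per purchase
purchase_frequency = 4.0     # purchases per year
retention_years = 5.0        # expected customer lifespan in years
acquisition_cost = 300.0     # cost of acquiring the customer

lifetime_value = avg_purchase_value * purchase_frequency * retention_years - acquisition_cost
print(f"Estimated customer lifetime value: ${lifetime_value:,.2f}")
# A negative result would indicate a "below-zero" customer.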

Conclusion

The use of data science in the financial sector goes beyond fraud, risk management and customer analysis. Financial institutions can harness machine learning algorithms to automate business processes and improve security.

By using data science within finance, companies have new opportunities to win customer loyalty, safeguard their profits and stay competitive.

Originally published here



from Featured Blog Posts - Data Science Central https://ift.tt/3kNV5Aq
via Gabe's Musings

Data Science Movies Recommendation System

Nearly everybody wants to spend their leisure time watching movies with their loved ones. We have all had the same experience: we sit on the couch to pick a film to watch for the next two hours, yet still can't find one after 20 minutes. It is so frustrating. We certainly need a computer agent to give us movie recommendations when we have to pick a film, and to save our time.

Evidently, a movie recommendation agent has already become a fundamental part of our lives. As Data Science Central notes, although hard data is difficult to obtain, many informed sources estimate that, for major e-commerce platforms like Amazon and Netflix, recommenders may be responsible for as much as 10% to 25% of incremental revenue.

What is a Recommender System?

There are two types of recommendation systems. They are:

Content-Based Recommender System

A content-based recommender system works on data generated by a user. The data can be created either directly (such as clicking likes) or indirectly (such as clicking links). This information is used to build a profile for the user that includes the metadata of the items the user has interacted with. The more data the engine collects, the more accurate the recommender system becomes.

Collaborative Recommender System

A collaborative recommender system makes suggestions based on how related users liked an item. Users with common preferences are grouped together by the system. Recommender systems can also perform collaborative filtering using item similarities in addition to user similarities (such as "users who liked item X also liked Y"). Most systems are a combination of these two methods.

Making suggestions is not a novel idea. Even before e-commerce became so prevalent, retail sales staff promoted goods to consumers for upselling and cross-selling, ultimately optimising profit. The goal of recommendation programmes is exactly the same.

The other goal of a recommendation system is to achieve customer satisfaction by delivering valuable content and optimising the time a person spends on your website or channel. It also tends to increase customer engagement. In addition, ad budgets can be targeted only at those who have a tendency to respond, by highlighting relevant products and services.

Why Recommendation systems?

1. They assist the customer in identifying items of interest
2. They help the provider of products distribute them to the right customers
(a) To classify, for each consumer, the most appropriate products
(b) To display customised content to each user
(c) To recommend top deals and discounts to the right customer
3. They enhance user interaction with websites
4. They raise company profits through increased consumption.

Daily Life Examples of Movie Recommender Systems:

1. GroupLens
a) Helped develop early recommender systems by pioneering the collaborative filtering model
b) Provided many datasets for training models, including MovieLens and BookLens

2. Amazon
a) Implemented commercial recommender systems
b) Also introduced many computational improvements

3. Netflix
a) Pioneered latent factor / matrix factorization models

4. Google
a) Search results in the search bar
b) Gmail next-word suggestions

5. YouTube
a) Making playlists
b) Suggesting videos of the same genre
c) Hybrid recommendation systems
d) Deep learning based systems

Let's move on to the coding part. The dataset link is: https://www.kaggle.com/rounakbanik/the-movies-dataset

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Load the movies dataset
df1 = pd.read_csv('../input/movies-dataset/movie_dataset.csv')
print(df1.columns)
print(df1.head(5))

# --- Top movies by budget ---
rich = df1.sort_values('budget', ascending=False)
fig, ax = plt.subplots()
rects1 = ax.bar(rich['title'].head(15), rich['budget'].head(15),
                color=["Red", "Orange", "Yellow", "Green", "Blue"])
plt.xlabel("Movie Title")
plt.rcParams["figure.figsize"] = (50, 50)
plt.title("Budget Wise top movies")
plt.ylabel("Movie Budget")

def autolabel(rects):
    # Annotate each bar with its height (budget scaled down by 100,000)
    for rect in rects:
        height = rect.get_height()
        ax.text(rect.get_x() + rect.get_width() / 2., 1.05 * height,
                '%f' % float(height / 100000),
                ha='center', va='bottom')

autolabel(rects1)
plt.xticks(rotation=90)
plt.show()

# --- Top movies by average rating ---
rich1 = df1.sort_values('vote_average', ascending=False)
print(rich1.head())
fig, ax = plt.subplots()
rects1 = ax.bar(rich1['title'].head(20), rich1['vote_average'].head(20),
                color=["Red", "Orange", "Yellow", "Green", "Blue"])
plt.xlabel("Movie Title")
plt.rcParams["figure.figsize"] = (30, 20)
plt.title("Rating Wise top movies")
plt.ylabel("Average rating")

def autolabel(rects):
    # Annotate each bar with its average rating
    for rect in rects:
        height = rect.get_height()
        ax.text(rect.get_x() + rect.get_width() / 2., 1.05 * height,
                '%f' % float(height),
                ha='center', va='bottom')

autolabel(rects1)
plt.xticks(rotation=90)
plt.show()

# --- Weighted rating (IMDB formula) ---
C = df1['vote_average'].mean()          # mean rating across all movies
print(C)
m = df1['vote_count'].quantile(0.9)     # minimum votes required (90th percentile)
q_movies = df1.copy().loc[df1['vote_count'] >= m]
print(q_movies.shape)

def weightedrating(x, m=m, C=C):
    v = x['vote_count']
    R = x['vote_average']
    # Calculation based on the IMDB formula
    return (v / (v + m) * R) + (m / (m + v) * C)

# A new column for the weighted rating, named weight_score
q_movies['weight_score'] = q_movies.apply(weightedrating, axis=1)
# Sort movies based on the score calculated above
q_movies = q_movies.sort_values('weight_score', ascending=False)
# Print the top 20 movies
print(q_movies[['title', 'vote_count', 'vote_average', 'weight_score']].head(20))

# --- Most popular movies ---
pop = df1.sort_values('popularity', ascending=False)
plt.figure(figsize=(12, 4))
plt.barh(pop['title'].head(5), pop['popularity'].head(5), align='center',
         color=['red', 'pink', 'orange', 'yellow', 'green'])
plt.gca().invert_yaxis()
plt.xlabel("Popularity")
plt.title("Popular Movies")
plt.show()

print(df1['overview'].head(5))

# --- Content-based recommender ---
# Create a column in the DataFrame which combines all selected features
features = ['keywords', 'cast', 'genres', 'director']
for feature in features:
    df1[feature] = df1[feature].fillna('')

def combine_features(row):
    try:
        return row['keywords'] + " " + row['cast'] + " " + row["genres"] + " " + row["director"]
    except Exception:
        print("Error:", row)

df1["combined_features"] = df1.apply(combine_features, axis=1)

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

cv = CountVectorizer()
count_matrix = cv.fit_transform(df1["combined_features"])
# Compute the cosine similarity based on the count_matrix
cosine_sim = cosine_similarity(count_matrix)
sim_df = pd.DataFrame(cosine_sim, index=df1.title, columns=df1.title)
print(sim_df.head())

# Movies most similar to a movie the user likes
movie_user_likes = "Avatar"
print(sim_df[movie_user_likes].sort_values(ascending=False)[:20])
movie_user_likes = "Gravity"
print(sim_df[movie_user_likes].sort_values(ascending=False)[:20])

# --- Item-based collaborative filtering on a toy ratings matrix ---
ratings = pd.read_csv("../input/colab-fitting/toy_dataset.csv", index_col=0)
ratings = ratings.fillna(0)
print(ratings)

def standardize(row):
    # Mean-centre and rescale ratings so similarity is not dominated by raw rating scale
    return (row - row.mean()) / (row.max() - row.min())

ratings_std = ratings.apply(standardize)
item_similarity = cosine_similarity(ratings_std.T)   # similarity between movies (columns)
print(item_similarity)

item_similarity_df = pd.DataFrame(item_similarity,
                                  index=ratings.columns, columns=ratings.columns)
print(item_similarity_df)

def get_similar_movies(movie_name, user_rating):
    # Weight similarity scores by how far the user's rating is from a neutral 2.5
    similar_score = item_similarity_df[movie_name] * (user_rating - 2.5)
    similar_score = similar_score.sort_values(ascending=False)
    return similar_score

print(get_similar_movies("romantic3", 1))

# Combine recommendations for a user who loves action and dislikes romance
# (built in one go, since DataFrame.append is deprecated in recent pandas)
action_lover = [("action1", 5), ("romantic2", 1), ("romantic3", 1)]
similar_movies = pd.DataFrame(
    [get_similar_movies(movie, rating) for movie, rating in action_lover]
)
print(similar_movies.head())
print(similar_movies.sum().sort_values(ascending=False))

In case the user or the movie is very new, we do not have many records to predict from; in such cases, default values will appear in the recommendations. We can measure the performance of the recommendation system by comparing predicted values with the original rating values, using the RMSE (root mean squared error). In this case, the RMSE value is 0.9313; whether that is good or bad depends on the size and scale of the dataset.
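For instance, a minimal sketch of that evaluation looks like the following; the two arrays are illustrative placeholders for held-out actual ratings and the recommender's predictions, not values from the dataset above.

import numpy as np

actual = np.array([4.0, 3.5, 5.0, 2.0, 4.5])      # held-out user ratings (placeholder values)
predicted = np.array([3.6, 3.9, 4.4, 2.8, 4.1])   # ratings predicted by the recommender

rmse = np.sqrt(np.mean((actual - predicted) ** 2))  # root mean squared error
print(f"RMSE: {rmse:.4f}")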

Disadvantages of a Movie Recommendation System

  • It does not work for a new user who has not rated any items yet, since enough ratings are required before a content-based recommender can evaluate the user's preferences and provide accurate recommendations.
  • It does not recommend serendipitous items.
  • Limited content analysis: the recommender does not work if the system fails to distinguish the items that a user likes from the items that he does not like.

Conclusion

In this article, we discussed what a recommender system is, the two main types of recommendation systems, real-life examples, and the disadvantages of a data science movie recommendation system.

Author Bio

Rohit Sharma is the Program Director for the upGrad-IIIT Bangalore PG Diploma Data Analytics Program, one of the leading data science courses. He is motivated to leverage technology to solve problems and works on problems of scale and long-term technology strategy.



from Featured Blog Posts - Data Science Central https://ift.tt/3oUoNWK
via Gabe's Musings

AI Generated Avatars Becoming Digital Influencers

As the recent rise in Covid-19 threatens once again to shutter advertising agencies, film studios, and similar media "factories" globally, a quiet, desperate shift is taking place in the creation of new media, brought about by increasingly sophisticated AI capabilities. A new spate of actors and models are making their way to people's screens, such as pink-haired Imma, above, who has developed an extensive following in Japan on Instagram and TikTok, and is appearing increasingly on the covers of Japanese magazines.

She also doesn't exist.

Imma joins a growing host of digital avatars who are replacing human actors, models, and photographers with computer-generated equivalents. Cloud-based GPUs and sophisticated game and modeling software have increasingly attracted the attention of a new generation of artist/programmers who are taking advantage of this to generate images, video, and audio that are becoming increasingly indistinguishable from reality, especially when that reality is otherwise captured via jump cuts, and matte overlays that have made tools such as TikTok and Reels the primary tools for video production for the typical Instagram celebrity.

The business potential for such virtual models and spokespeople is huge, according to a recent piece by Bloomberg on digital avatars. Such avatars have obvious benefits over their flesh and blood counterparts. They can appear in print or video anywhere - on a far-off beach, on a busy street in a bustling city, or staring out at dirigibles and flying saucers while taking a taxi above the clouds - without ever having to send a crew out for several days to some otherwise uninhabited Caribbean beach, reserving expensive permits for filming or dealing with observers, or spending a great deal of time with specialized green screen effects.

The models don't age out of roles, don't have bad hair days, or become prima donnas. The initial cost to develop such models may be fairly high (though seldom more than the cost of sending models and crew to a tropical island) but once created, that model becomes highly malleable, and can be used in a large number of different situations.

Ordinarily, these changes have been on the horizon for a while, and until the pandemic, the use of such models was increasing slowly anyway. Yet as for so many other things, the pandemic shifted the need for virtual models and actors into overdrive, as social distancing requirements and lockdowns put a very real limit on the ability of creative agencies to put together content with live actors.

It is significant, for instance, that The Walt Disney Company, which had already invested heavily in the use of CGI-based actors and props for its wildly popular series The Mandalorian, was able to finish production on the awaited second season of that show so quickly after California's Governor Newsom started loosening the limits on production. It did so by taking advantage of the hyperrealistic rendering capabilities of graphical processing units (GPUs), which are now capable of extensive texture mapping, shading, ray tracing, subdermal lighting and so forth by dint of GPU architectures that were optimized for better gameplay.

Moreover, such processing can readily be done in parallel across multiple dedicated cloud processors, which not only reduces render time dramatically but also makes modeling, rigging (the process of positioning objects relative to one another) and lighting take far less time than they would on a single laptop. This also makes procedural shaders, which handle the animation of everything from hair to skin to dirt to water and smoke, feasible for the home-bound graphic artist or designer. Companies such as nVidia have partnered with cloud providers to make racks of GPUs acting in parallel available to anyone with a decent Internet connection, without necessarily needing to have such GPUs available on the artist's computers directly. 

This process is also driven by the rise of generative adversarial networks (a form of neural network that is able to take related types of images and build subtle composites, which can then be evaluated to determine the "verisimilitude" of a given image; those that survive are then used as the basis for other such images). When used as is, such GANs are remarkably effective at creating realistic portraits of people quickly, while at the same time also providing templates for the generation of the models underlying such portraits that can then be used for simulating action. The StyleGan2 algorithms drive the site This Person Does Not Exist, which generates unique, very realistic facial images. The site also contains detailed information about the StyleGan2 algorithms involved in the process.

The ability to create such realistic avatars has also led to the rise of fake identities in social media sites, as people take advantage of such tools to either disguise themselves on places as diverse as Facebook and Tinder, or simulate fake profiles for spamming and trolling, or, in at least a couple of cases, creating faked dossiers for political mischief, complete with people who existed only within a GPU. 

StyleGan2, in conjunction with similar research into capturing facial orientation and lip-sync movements, is also changing the nature of gaming, as players gain the ability to wear avatars that can speak and make facial gestures. Whether driven by pre-generated audio recorded live or by text-to-speech interfaces such as those that power a host of verbal virtual assistants (Siri being the Ur example), such viseme-matching algorithms are increasingly putting a face to what had once been primarily audio-only "digital companions".  

It is very likely that, for good or ill, this particular area of development will become one of the hottest faces of the AI movement. If the Japanese experience is any indication, such "Virtual Ambassadors" may very well become celebrities in their own right. In many cases, the environments for creating such virtual models are likely to be the same platforms that are currently used for developing immersive reality games, such as Epic's Unreal Engine or Reallusion's Character Creator platform. As the pandemic continues to change how we work, it is likely that the next generation of advertising, media creation and immersive gaming will rely upon these virtual avatars, the next generation of digital influencers.

Kurt Cagle is the Community Editor for Data Science Central.



from Featured Blog Posts - Data Science Central https://ift.tt/3elgkHg
via Gabe's Musings

Tuesday, October 27, 2020

Digital Twin, Virtual Manufacturing, and the Coming Diamond Age

If you have ever had a book self-published through Amazon or similar fulfillment houses, chances are good that the physical book did not exist prior to the order being placed. Instead, that book existed as a PDF file, image files for cover art and author photograph, perhaps with some additional XML-based metadata indicating production instructions, trim, paper specifications, and so forth.

When the order was placed, it was sent to a printer that likely was the length of a bowling alley, where the PDF was converted into a negative and then laser printed onto the continuous paper stock. This was then cut to a precise size that varied minutely from page to page depending upon the binding type, before being collated and glued into the binding.

At the end of the process, a newly printed book dropped onto a rolling platform and from there to a box, where it was potentially wrapped and deposited automatically before the whole box was closed, labeled, and passed to a shipping gurney. From beginning to end, the whole process likely took ten to fifteen minutes, and more than likely no human hands touched the book at any point in the process. There were no plates to change out, no prepress film being created, no specialized inking mixes prepared between runs. Such a book was not "printed" so much as "instantiated", quite literally coming into existence only when needed.

It's also worth noting here that the same book probably was "printed" to a Kindle or similar ebook format, but in that particular case, it remained a digital file. No trees were destroyed in the manufacture of the ebook.

Such print on demand capability has existed since the early 2000s, to the extent that most people generally do not even think much about how the physical book that they are reading came into existence. Yet this model of publishing represents a profound departure from manufacturing as it has existed for centuries, and is in the process of transforming the very nature of capitalism.

Shortly after these printing presses came online, there were a number of innovations with thermal molded plastic that made it possible to create certain types of objects to exquisite tolerances without actually requiring a physical mold. Ablative printing techniques had been developed during the 1990s and involved the use of lasers to cut away at materials based upon precise computerized instruction, working in much the same that a sculptor chips away at a block of granite to reveal the statue within.

Additive printing, on the other hand, made use of a combination of dot matrix printing and specialized lithographic gels that would be activated by two lasers acting in concert. The gels would harden at the point of intersection, then when done the whole would be flushed with reagents that removed the "ink" that hadn't been fixed into place. Such a printing system solved one of the biggest problems of ablative printing in that it could build up an internal structure in layers, making it possible to create interconnected components with minimal physical assembly.

The primary limitation that additive printing faced was the fact that it worked well with plastics and other gels, but the physics of metals made such systems considerably more difficult to solve - and a great deal of assembly requires the use of metals for durability and strength. By 2018, however, this problem was increasingly finding solutions for various types of metals, primarily by using annealing processes that heated up the metals to sufficient temperatures to enable pliability in cutting and shaping.

What this means in practice is that we are entering the age of just in time production in which manufacturing exists primarily in the process of designing what is becoming known as a digital twin. While one can argue that this refers to the use of CAD/CAM like design files, there's actually a much larger, more significant meaning here, one that gets right to the heart of an organization's digital transformation. You can think of digital twins as the triumph of design over manufacturing, and data and metadata play an oversized role in this victory.

At the core of such digital twins is the notion of a model. A model, in the most basic definition of the word, is a proxy for a thing or process. A runway model, for instance, is a person who is intended to be a proxy for the viewer, showing off how a given garment looks. An artist's model is a stand-in or proxy for the image, scene, or illustration that an artist is producing. An architectural model is a simulation of how a given building will look like when constructed, and with 3D rendering technology, such models can appear quite life-like. Additionally, though, the models can also simulate more than appearance - they can simulate structural integrity, strain analysis, and even chemistry interactions. We create models of stars, black holes, and neutron stars based upon our understanding of physics, and models of disease spread in the case of epidemics.

Indeed, it can be argued that the primary role of a data scientist is to create and evaluate models. It is one of the reasons that data scientists are in such increasing demand: the ability to build models is one of the most pressing capabilities that any organization can have, especially as more and more of a company's production exists in the form of digital twins.

There are several purposes for building such models: the most obvious is to reduce (or in some cases eliminate altogether) the cost of instantiation. If you create a model of a car, you can stress test the model, can get feedback from potential customers about what works and what doesn't in its design, can determine whether there's sufficient legroom or if the steering wheel is awkwardly placed, can test to see whether the trunk can actually hold various sized suitcases or packages, all without the cost of actually building it. You can test out gas consumption (or electricity consumption), can see what happens when it crashes, can even attempt to explode it. While such models aren't perfect (nor are they uniform), they can often serve to significantly reduce the things that may go wrong with the car before it ever goes into production.

However, such models, such digital twins, also serve other purposes. All too often, decisions are made not on the basis of what the purchasers of the thing being represented want, but on what a designer, or a marketing executive, or the CEO of a company feels the customer should get. When there was a significant production cost involved in instantiating the design, this often meant a strong bias towards what the decision-maker greenlighting the production felt should work, rather than what the stakeholders who would not only be purchasing but also using the product actually wanted. With 3D production increasingly becoming a reality, however, control is shifting from the producer to the consumer, and not just at the higher end of the market.

Consider automobile production. Currently, millions of cars are produced by automakers globally, but a significant number never get sold. They end up clogging lots, moving from dealerships to secondary markets to fleet sales, and eventually end up in the scrapyard. They don't get sold primarily because they simply don't represent the optimal combination of features at a given price point for the buyer.

The industry has, however, been changing its approach, pushing the consumer much closer to the design process before the car is actually even built. Colors, trim, engine type, seating, communications and entertainment systems, types of brakes, all of these and more can be changed. Increasingly, these changes are even making their way to the configuration of the chassis and carriage. This becomes possible because it is far easier to change the design of the digital twin than it is to change the physical entity, and that physical entity can then be "instantiated" within a few days of ordering it.

What are the benefits? You end up producing product upon demand, rather than in anticipation of it. This means that you need to invest in fewer materials, have smaller supply chains, produce less waste, and in general have a more committed customer. The downside, of course, is that you need fewer workers, have a much smaller sales infrastructure, and have to work harder to differentiate your product from your competitors. This is also happening now - it is becoming easier for a company such as Amazon to sell bespoke vehicles than ever before, because of that digitalization process.

This is in fact one of the primary dangers facing established players. Even today, many C-Suite managers see themselves in the automotive manufacturing space, or the aircraft production space, or the book publishing space. Yet ultimately, once you move to a stage where you have digital twins creating a proxy for the physical object, the actual instantiation - the manufacturing aspect - becomes very much a secondary concern.

Indeed, the central tenet of digital transformation is that everything simply becomes a publishing exercise. If I have the software product to build a car, then ultimately the cost of building that car involves purchasing the raw materials and the time on a 3D printer, then performing the final assembly. There is a growing "hobbyist" segment of companies that can go from bespoke design to finished product in a few weeks. Ordinarily the volume of such production is low enough that it is likely tempting to ignore what's going on, but between Covid-19 reshaping retail patterns, the diminishing spending power of Millennials and GenZers, and the changes being increasingly required by Climate Change, the bespoke digital twin is likely to eat into increasingly thin margins.

Put another way, existing established companies in many different sectors have managed to maintain their dominance both because they were large enough to dictate the language that described the models and because they could take advantage of the costs involved in manufacturing and production creating a major barrier to entry of new players. That's now changing.

Consider the first part of this assertion. Names are important. One of the realizations that has emerged in the last twenty years is that before two people or organizations can communicate with one another, they need to establish (and refine) the meanings of the language used to identify entities, processes, and relationships. An API, when you get right down to it, is a language used to interact with a system. The problem with trying to deal with intercommunication is that it is generally far easier to establish internal languages - the way that one organization defines its terms - than it is to create a common language. For a dominant organization in a given sector, this often also manifests as the desire to dominate the linguistic debate, as this puts the onus of changing the language (a time-consuming and laborious process) into the hands of competitors.

However, this approach has also backfired spectacularly more often than not, especially when those competitors are willing to work with one another to weaken a dominant player. Most successful industry standards are pidgins - languages that capture 80-90% of the commonality in a given domain while providing a way to communicate about the remaining 10-20% that typifies the specialty of a given organization. This is the language of the digital twin, the way that you describe it, and the more that organizations subscribe to that language, the easier it is for those organizations to interchange digital twin components.

To put this into perspective, consider the growth of bespoke automobiles. One form of linguistic harmonization is the standardization of containment - the dimensions of a particular component, the location of ports for physical processes (pipes for fluids, air and wires) and electronic ones (the use of USB or similar communication ports), agreements on tolerances and so forth. With such ontologies in place, construction of a car's digital twin becomes far easier. Moreover, by adhering to these standards, linguistic as well as dimensional, you still get specialization at a functional level (for instance, the performance of a battery) while at the same time being able to facilitate containment variations, especially with digital printing technology.

As an ontology emerges for automobile manufacturing, this facilitates "plug-and-play" at a macro-level. The barrier to entry for creating a vehicle drops dramatically, though likely not quite to the individual level (except for well-heeled enthusiasts). Ironically, this makes it possible for a designer to create a particular design that meets their criterion, and also makes it possible for that designer to sell or give that IP to others for license or reuse. Now, if history is any indication, that will likely initially lead to a lot of very badly designed cars, but over time, the bad designers will get winnowed out by long-tail market pressures.

Moreover, because it becomes possible to test digital twins in virtual environments, the market for digital wind-tunnels, simulators, stress analyzers and so forth will also rise. That is to say, just as programming has developed an agile methodology for testing, so too would manufacturing facilitate data agility that serves to validate designs. Lest this be seen as a pipe dream, consider that most contemporary game platforms can, with very little tweaking, be reconfigured for exactly this kind of simulation work, especially as GPUs increase in performance and available memory.

The same type of interoperability applies not just to the construction of components, but also to all aspects of resource metadata, especially with datasets. Ontologies provide ways to identify, locate and discover the schemas of datasets for everything from usage statistics to simulation parameters for training models. The design of that car (or airplane, or boat, or refrigerator) is simply one more digital file, transmissible in the same way that a movie or audio file is, and containing metadata that puts those resources into the broader context of the organization.

The long term impact on business is simple. Everything becomes a publishing company. Some companies will publish aircraft or automobiles. Others will publish enzymes or microbes, and still others will publish movies and video games. You still need subject matter expertise in the area that you are publishing into - a manufacturer of pastries will be ill-equipped to handle the publishing of engines, for instance, but overall you will see a convergence in the process, regardless of the end-product.

How long will this process take to play out? In some cases, it's playing out now. Book publishing is almost completely virtual at this stage, and the distinction between the physical object and the digital twin comes down to whether instantiation takes place or not. The automotive industry is moving in this direction, and drone tech (especially for military drones) has been shifting this way for years.

On the other hand, entrenched companies with extensive supply chains will likely adopt such digital twins approaches relatively slowly, and more than likely only at a point where competitors make serious inroads into their core businesses (or the industries themselves are going through a significant economic shock). Automobiles are going through this now, as the combination of the pandemic, the shift towards electric vehicles, and changing demographics are all creating a massive glut in automobile production that will likely result in the collapse of internal combustion engine vehicle sales altogether over the next decade along with a rethinking of the ownership relationship with respect to vehicles.

Similarly, the aerospace industry faces an existential crisis as demand for new aircraft has dropped significantly in the wake of the pandemic. While aircraft production is still a very high-cost business, the ability to create digital twins - along with an emergence of programming ontologies that make interchange between companies much more feasible - has opened up the market to smaller, more agile competitors who can create bespoke aircraft much more quickly by distributing the overall workload and specializing in configurable subcomponents, many of which are produced via 3D printing techniques.

Construction, likewise, is dealing with both the fallout due to the pandemic and the increasing abstractions that come from digital twins. The days when architects worked out details on paper blueprints are long gone, and digital twins of construction products are increasingly being designed with earthquake and weather testing, stress analysis, airflow and energy consumption and so forth. Combine this with the increasing capabilities inherent in 3D printing both full structures and custom components in concrete, carbon fiber and even (increasingly) metallic structures. There are still limitations; as with other large structure projects, the lack of specialized talent in this space is still an issue, and fabrication units are typically not yet built on a scale that makes them that useful for onsite construction.

Nonetheless, the benefits make achieving that scaling worthwhile. A 3D printed house can be designed, approved, tested, and "built" within three to four weeks, as opposed to six months to two years for traditional processes. Designs, similarly, can be bought or traded and modified, making it possible to create neighborhoods where there are significant variations between houses as opposed to the prefab two to three designs that tend to predominate in the US especially. Such constructs also can move significantly away from the traditional boxy structures that most houses have, both internally and externally, as materials can be shaped to best fit the design aesthetic rather than the inherent rectangular slabs that typifies most building construction.

Such constructs can also be set up to be self-aware, to the extent that sensors can be built into the infrastructure and viewscreens (themselves increasingly moving away from flatland shapes) can replace or augment the views of the outside world. In this sense, the digital twin of the instantiated house or building is able to interact with its physical counterpart, maintaining history (memory) while increasingly able to adapt to new requirements.

This feedback loop - the ability of the physical twin to affect the model - provides a look at where this technology is going. Print publishing, once upon a time, had been something where the preparation of the medium, the book or magazine or newspaper, occurred only in one direction - from digital to print. Today, the print resides primarily on phones or screens or tablets, and authors often provide live blog chapters that evolve in agile ways. You're seeing the emergence of processors such as FPGAs that configure themselves programmatically, literally changing the nature of the processor itself in response to software code.

It's not that hard, with the right forethought, to envision real-world objects that can reconfigure themselves in the same way - buildings reconfiguring themselves for different uses or to adapt to environmental conditions, cars that can reconfigure their styling or even body shape, clothing that can change color or thermal profiles, aircraft that can be reconfigured for different uses within minutes, and so forth. This is reality in some places, though still piecemeal and in one-offs, but the malleability of digital twins - whether of office suites or jet engines - is the future of manufacturing.

The end state, likely still a few decades away, will be an economy built upon just-in-time replication and the importance of the virtual twin, where you are charged not for the finished product but the cost of the license to use a model, the material components, the "inks", for same, and the processing to go from the former to the latter (and back), quite possibly with some form of remuneration for recycled source. Moreover, as this process continues, more and more of the digital twin carries the burden of existence (tools that "learn" a new configuration are able to adapt to that configuration at any time). The physical and the virtual become one.

Some may see the resulting society as utopian, others as dystopian, but what is increasingly unavoidable is the fact that this is the logical conclusion of the trends currently at work (for some inkling of what such a society may be like, I'd recommend reading The Diamond Age by Neal Stephenson, which I believe to be very prescient in this regard).



from Featured Blog Posts - Data Science Central https://ift.tt/2TwPouM
via Gabe's Musings

FINTECH TRENDS: AI, SMART CONTRACTS, NEOBANKS, OPEN BANKING AND BLOCKCHAIN

What Is Fintech? 

"Fintech" describes the new technology integrated into various spheres to improve and automate all aspects of financial services provided to individuals and companies. Initially, this word was used for the tech behind the back-end systems of big banks and other organizations. And now it covers a wide specter of finance-related innovations in multiple industries, from education to crypto-currencies and investment management. 

While traditional financial institutions offer a bundle of services, fintech focuses on streamlining individual offerings, making them affordable, often one-click experiences for users. This impact can be described with the word "disruption" - and now, to stay competitive, banks and other conventional establishments have no choice but to change entrenched practices through cooperation with fintech startups. A vivid example is Visa's partnership with Ingo Money to accelerate digital payments. Despite the slowdown related to the Covid-19 pandemic, the fintech industry is expected to recover momentum and continue to change the face of finance.

Fintech users

Fintech users fall into four main categories. Such trends as mobile banking, big data, and unbundling of financial services will create an opportunity for all of them to interact in novel ways:

  1. B2B - banks
  2. B2B - banks' business clients
  3. B2C - small enterprises
  4. B2C - individual consumers

The main target group for consumer-oriented fintech is millennials - young, ready to embrace digital transformation, and accumulating wealth.

What needs do they have? According to a Credit Karma survey, 85% of millennials in the USA suffer from burnout syndrome and have no energy left to think about managing their personal finances. Therefore, any app that automates and streamlines these processes has a good chance of becoming popular. They need an affordable personal financial assistant that can do the following 24/7:

  • Analyze spending behaviors, recurring payments, bills, and debts (a minimal sketch of this kind of analysis follows these lists)
  • Present an overview of their current financial situation
  • Provide coaching and improve financial literacy

What they expect to achieve:

  • Stop overspending (avoid late bills, do smart shopping with price comparison, cancel unnecessary subscriptions, etc.)
  • Develop saving habits, get better organized
  • Invest money (analyze deposit conditions in different banks, form an investment portfolio, etc.)
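
As promised above, here is a minimal, hedged sketch of the "analyze spending behaviors" idea: spotting recurring charges in a transaction log with pandas. The merchants, amounts, and threshold are invented for illustration and are not drawn from any real product.

    # Hypothetical sketch: flag likely recurring charges (subscriptions) in a
    # transaction log. Merchant names, amounts, and the threshold are invented.
    import pandas as pd

    txns = pd.DataFrame({
        "merchant": ["Streamly", "Streamly", "Streamly", "GroceryMart", "GymPro", "GymPro"],
        "amount":   [9.99, 9.99, 9.99, 54.20, 29.00, 29.00],
        "month":    ["2020-06", "2020-07", "2020-08", "2020-08", "2020-07", "2020-08"],
    })

    # A merchant charging the same amount in two or more distinct months looks recurring.
    recurring = (
        txns.groupby(["merchant", "amount"])["month"]
        .nunique()
        .reset_index(name="months_charged")
        .query("months_charged >= 2")
    )
    print(recurring)  # candidates for the "cancel unnecessary subscriptions" goal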

The fintech industry offers many solutions that can meet all these goals - not only on an individual but also on a national scale. However, many countries still have a high percentage of unbanked people - people without any form of bank account. According to the World Bank report, this number was 1.7 billion people in 2017. Mistrust of new technologies, poverty, and financial illiteracy are the obstacles that keep this group from tapping into the huge potential of fintech. Therefore, businesses and governments must direct inclusion efforts towards this audience, as all stakeholders will benefit. Affordable, easy-to-access fintech services customized for this huge group of first-time users are likely to be a big trend in the future.

Big Data, AI, ML in Fintech

According to an Accenture report, AI integration will boost corporate profits in many industries, including fintech, by almost 40% by 2035, which equals a staggering $14 trillion. Without a doubt, Big Data technologies such as Streaming Analytics, In-memory Computing, Artificial Intelligence, and Machine Learning will be the powerhouse behind the numerous business objectives that banks, credit unions, and other institutions strive to achieve:

  • Aggregate and interpret massive amounts of structured and unstructured data in real-time.
  • With the help of predictive analytics, make accurate future forecasts, identify potential problems (e.g., credit scoring, investment risks)
  • Build optimal strategies based on analytical reports
  • Facilitate decision-making
  • Segment clients for more personalized offers and thus increase retention.
  • Detect suspicious behavior, prevent identity fraud and other types of cybercrime, and make transactions more secure with technologies such as face and voice recognition (a minimal anomaly-detection sketch follows this list).
  • Find and extend new borrower pools among the no-file/thin-file segment, widely represented by Gen Z (the successors of millennials), who lack or have a short credit history.
  • Automate low-value tasks (e.g., such back-office operations as internal technical requests)
  • Cut operational expenses by streamlining processes (e.g., image recognition algorithms for scanning, parsing documents, and taking further actions based on regulations) and reducing man-hours.
  • Considerably improve client experience with conversational user interfaces, available 24/7, and capable of resolving any issues instantly. Conversational banking is used by many big banks worldwide; some companies integrate financial chatbots for processing payments in social media.
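
As a concrete illustration of the fraud-detection bullet above, here is a minimal, hedged sketch using scikit-learn's IsolationForest to flag anomalous transactions. The features and data are invented for illustration; a production system would use far richer signals and a properly validated model.

    # Hypothetical anomaly-detection sketch for flagging suspicious transactions.
    import numpy as np
    from sklearn.ensemble import IsolationForest

    # Each row: [amount_usd, hour_of_day, days_since_last_txn] (illustrative features)
    transactions = np.array([
        [12.50, 14, 1],
        [40.00,  9, 2],
        [25.30, 18, 1],
        [9.99,  20, 3],
        [5400.00, 3, 0],   # unusually large amount at an odd hour
    ])

    model = IsolationForest(contamination=0.1, random_state=42)
    model.fit(transactions)

    # predict() returns -1 for points the model considers anomalous.
    for row, label in zip(transactions, model.predict(transactions)):
        if label == -1:
            print("Flag for review:", row)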

Neobanks

Digital or internet-only banks have no brick-and-mortar branches and operate exclusively online. The word neobank became widely used in 2017 and refers to two types of app-based institutions - those that provide financial services under their own banking license and those that partner with traditional banks. Time wasted in lines and on paperwork is a key reason why bank visits are predicted to fall to just four a year by 2022. Neobanks such as Revolut, Digibank, and FirstDirect offer a wide range of services - global payments and P2P transfers, virtual cards for contactless transactions, operations with cryptocurrencies, etc. - and their fees are lower than those of traditional banks. Clients get support through in-app chat. Among the challenges associated with digital banking are a higher susceptibility to fraud and lower perceived trustworthiness due to the lack of a physical address. In the US, the development of neobanks has faced regulatory obstacles, though the situation is changing for the better.

Smart contracts

A smart contract is software that automatically executes and enforces an agreement between a buyer and a seller. How does it work? If two parties want to agree on a transaction, they no longer need a paper document and a lawyer. They sign the agreement digitally with cryptographic keys, and the document itself is encoded in a tamper-proof manner. The role of witnesses is performed by a decentralized blockchain network of computing devices that receive copies of the contract, and the code guarantees the fulfillment of its provisions, with all transactions transparent, trackable, and irreversible. This level of reliability and security makes fintech operations possible anywhere in the world, at any time. The parties to the contract can be anonymous, and there is no need for other authorities to regulate or enforce its implementation.
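
Smart contracts themselves are usually written in blockchain-specific languages such as Solidity, but the underlying idea - code that holds value and releases it only when agreed conditions are met, with no party able to intervene - can be sketched in ordinary Python. The escrow class below is a simplified, hypothetical illustration of that logic, not an on-chain implementation.

    # Hypothetical escrow sketch of the logic a smart contract encodes:
    # the payout happens automatically once both parties have confirmed.
    class EscrowContract:
        def __init__(self, buyer, seller, amount):
            self.buyer = buyer
            self.seller = seller
            self.amount = amount
            self.confirmations = set()
            self.released = False

        def confirm(self, party):
            if party in (self.buyer, self.seller):
                self.confirmations.add(party)
            self._maybe_release()

        def _maybe_release(self):
            # On a real blockchain this check runs deterministically on every node,
            # so no single party can block or reverse the payout.
            if {self.buyer, self.seller} <= self.confirmations and not self.released:
                self.released = True
                print(f"Releasing {self.amount} to {self.seller}")

    contract = EscrowContract("alice", "bob", 100)
    contract.confirm("alice")
    contract.confirm("bob")   # triggers the automatic release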

Open banking

Open banking is a system that allows third parties to access data from bank and non-bank financial institutions through APIs (application programming interfaces) to create a network. Third-party service providers, such as tech startups, aggregate these data through apps - with user consent - and apply them to identify, for instance, the best financial products, such as a savings account with the highest interest rate. Networked accounts will allow banks to calculate mortgage risks more accurately and offer the best terms to low-risk clients. Open banking will also help small companies save time with online accounting and will play an important role in fraud detection. Services like Mint require users to provide credentials for each account, a practice that carries security risks, and the resulting data processing is not always accurate. APIs are a better option because they allow direct data sharing without exposing logins and passwords. Consumer security concerns remain, however, and this is one of the main reasons why the open banking trend hasn't taken off yet. Many banks worldwide cannot yet provide open APIs of sufficient quality to meet existing regulatory standards, and there are still many blind spots, including technological ones. Even so, open banking is a promising trend, and the Accenture report offers many interesting insights.
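
In practice, a third-party provider consumes these APIs over OAuth-protected REST endpoints after the user grants consent. The snippet below is a hedged sketch against a hypothetical endpoint - the URL, token handling, and response fields are invented for illustration, and real open-banking APIs (for example those mandated under PSD2) differ from bank to bank.

    # Hypothetical open-banking aggregation sketch; the endpoint, token, and
    # response fields are illustrative only.
    import requests

    API_BASE = "https://api.examplebank.com/open-banking/v3"  # hypothetical URL
    TOKEN = "user-consented-oauth-token"  # obtained through the user's consent flow

    def fetch_accounts():
        resp = requests.get(
            f"{API_BASE}/accounts",
            headers={"Authorization": f"Bearer {TOKEN}"},
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json().get("accounts", [])

    def best_savings_account(accounts):
        # Compare aggregated accounts to surface the highest interest rate.
        savings = [a for a in accounts if a.get("type") == "savings"]
        return max(savings, key=lambda a: a.get("interest_rate", 0), default=None)

    accounts = fetch_accounts()
    print("Best savings account:", best_savings_account(accounts))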

Blockchain and cryptocurrencies

Blockchain, the distributed ledger technology underlying many cryptocurrencies, will continue to transform the face of global finance, with the US and China being the global adoption leaders. The most valuable feature of a blockchain database is that data cannot be altered or deleted once written. This high level of security makes it well suited to big data applications across sectors including healthcare, insurance, energy, and banking, especially those dealing with confidential information. Although the technology is still in the early stages of its development and will become better suited to the needs of fintech over time, there are already Blockchain-based solutions both from giants like Microsoft and IBM and from numerous startups. The philosophy of decentralized finance has already given rise to a variety of peer-to-peer financing platforms and will be the source of new cryptocurrencies, perhaps even national ones. Blockchain considerably accelerates transactions between banks through secure servers, and banks use it to build smart contracts. The technology is also growing in popularity with consumers: since 2009, when Bitcoin was created, the number of Blockchain wallet users has reached 52 million. A wallet relies on a layer of security known as "tokenization" - payment information is sent to vendors as tokens that associate the transaction with the right account.
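
The append-only property comes from each block committing to a hash of its predecessor, so changing any historical record invalidates every later hash. A minimal Python sketch of that chaining (just the hashing idea, not consensus, mining, or networking) looks like this:

    # Minimal hash-chain sketch: each block stores the previous block's hash,
    # so tampering with any earlier block breaks verification of the chain.
    import hashlib
    import json

    def block_hash(block):
        payload = json.dumps(block, sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()

    def add_block(chain, data):
        prev_hash = block_hash(chain[-1]) if chain else "0" * 64
        chain.append({"data": data, "prev_hash": prev_hash})

    def verify(chain):
        return all(
            chain[i]["prev_hash"] == block_hash(chain[i - 1])
            for i in range(1, len(chain))
        )

    chain = []
    add_block(chain, {"from": "alice", "to": "bob", "amount": 5})
    add_block(chain, {"from": "bob", "to": "carol", "amount": 2})
    print(verify(chain))              # True
    chain[0]["data"]["amount"] = 500  # attempt to rewrite history
    print(verify(chain))              # False: the tampering is detectable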

Regtech

Regtech, or regulation technology, is represented by companies such as IdentityMind Global, Suade, Passfort, and Fund Recs that provide AI-based SaaS solutions to help businesses comply with regulatory requirements. These companies process complex financial data, combine them with information on previous regulatory failures to detect potential risks, and design powerful analytical tools. Finance is a conservative industry, heavily regulated by the government. As the number of technology companies providing financial services increases, so do the problems associated with regulatory compliance. For instance, process automation makes fintech systems vulnerable to hacker attacks, which can cause serious damage. Responsibility for such security breaches, the misuse of sensitive data, and the prevention of money laundering and fraud are the main issues that concern state institutions, service providers, and consumers. With over 2.6 billion biometric users of payment systems expected by 2023, the application area for regtech is huge.

In the EU, PSD2 and SCA aim to regulate payments and their providers. Although these legal acts create some regulatory obstacles for fintech innovation, the European Commission has also proposed a series of alleviating changes, for instance eliminating paper documents for consumers. In the US, fintech companies must comply with outdated financial legislation. The silver lining is the new FedNow service for instantaneous payments, which is expected to launch in 2023–2024 and will provide ready public infrastructure.

Insurtech

The insurance industry, like many others, needs streamlining to become more efficient and cost-effective and to meet the demands of the times. Through a new generation of smart apps, insurtech companies are exploring new possibilities such as ultra-customized policies, behavior-based dynamic premium pricing built on data from Internet-enabled devices (GPS navigators, fitness activity trackers), AI brokerages, and on-demand insurance for micro-events. As mentioned before, the insurance business is also subject to strict government regulation, and it will take close cooperation between traditional insurers and startups to make a breakthrough that benefits everyone.



from Featured Blog Posts - Data Science Central https://ift.tt/3e3ZqwM
via Gabe's Musings

Cybersecurity Experts Discuss Company Misconception of The Cloud and More in Roundtable Discussion

Industry experts from TikTok, Microsoft, and more talk latest trends on cybersecurity & public policy.

Enterprise Ireland, Ireland’s trade and innovation agency, hosted a virtual Cyber Security & Public Policy panel discussion with several industry-leading experts. The roundtable discussion allowed cybersecurity executives from leading organizations to come together and discuss The Nexus of Cyber Security and Public Policy.

The panel included Roland Cloutier, the Global Chief Security Officer of TikTok; Ann Johnson, the CVP of Business Development - Security, Compliance & Identity at Microsoft; Richard Browne, the Director of Ireland’s National Cyber Security Centre; and Melissa Hathaway, the President of Hathaway Global Strategies LLC, who formerly spearheaded the Cyberspace Policy Review for President Barack Obama and led the Comprehensive National Cyber Security Initiative (CNCI) for President George W. Bush.

Panelists discussed the European Cloud and the misconception among companies that migrating to the Cloud guarantees complete safety and security, as well as whether such a move is the right decision for a company or a costly mistake. Each panelist also brought valuable perspective and experience to the table on other discussion topics, including cybersecurity’s recent rapid growth and changes; the differences between U.S. and EU policies and regulations; who holds the responsibility for protecting consumer data and privacy; and more.

 “As more nations and states continue to improve upon cybersecurity regulations, the conversation between those developing policy and those implementing it within the industry becomes more important,” said Aoife O’Leary, Vice President of Digital Technologies, Enterprise Ireland. “We were thrilled to bring together this panel from both sides of the conversation and continue to highlight the importance of these discussions for both Enterprise Ireland portfolio companies and North American executives and thought leaders.”

 This panel discussion was the second of three events in Enterprise Ireland’s Cyber Demo Day 2020 series, inclusive of over 60 leading Irish cyber companies, public policy leaders, and cyber executives from many of the largest organizations in North America and Ireland.

 To view a recording of the Cyber Security & Public Policy Panel Discussion from September 23rd, please click here.

###

About Enterprise Ireland

Enterprise Ireland is the Irish State agency that works with Irish enterprises to help them start, grow, innovate, and win export sales in global markets. Enterprise Ireland partners with entrepreneurs, Irish businesses, and the research and investment communities to develop Ireland's international trade, innovation, leadership, and competitiveness. For more information on Enterprise Ireland, please visit https://enterprise-ireland.com/en/.



from Featured Blog Posts - Data Science Central https://ift.tt/37KUTOD
via Gabe's Musings

5 Steps to Collect High-quality Data

Obtaining good quality data can be a tough task. An organization may face quality issues when integrating data sets from various applications or departments or when entering data manually.

Here are some of the things a company can do to improve the quality of the information it collects:

1. Data Governance plan

A good data governance plan should not only cover ownership, classification, sharing, and sensitivity levels but also spell out, in procedural detail, how your data quality goals will be met. It should also identify all the personnel involved in the process and their roles and, more importantly, define a process for resolving and working through issues.

2. Data Quality Guidance

You should also have a clear guide to use when separating good data from bad data. You will have to calibrate your automated data quality system with this information, so you need to have it laid out beforehand.
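
In practice that guidance is often encoded as explicit validation rules. The pandas sketch below is a hedged illustration - the column names, rules, and regular expression are assumptions, not a standard.

    # Hypothetical data-quality rules expressed as boolean masks over a dataframe.
    import pandas as pd

    df = pd.DataFrame({
        "customer_id": [1, 2, 2, None],
        "email": ["a@x.com", "b@x.com", "b@x.com", "not-an-email"],
        "signup_date": ["2020-01-05", "2020-03-01", "2020-03-01", "2020-02-30"],
    })

    rules = {
        "missing_customer_id": df["customer_id"].isna(),
        "duplicate_rows": df.duplicated(),
        "invalid_email": ~df["email"].str.contains(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", na=False),
        "bad_signup_date": pd.to_datetime(df["signup_date"], errors="coerce").isna(),
    }

    for rule, mask in rules.items():
        print(rule, "->", int(mask.sum()), "rows")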

3. Data Cleansing Process

Data correction is the whole point of looking for flaws in your datasets. Organizations need to provide guidance on what to do with specific forms of bad data and on identifying what is critical and common across all organizational data silos. Implementing data cleansing manually is cumbersome: as the business shifts, changing strategies dictate changes to the data and to the underlying processes.
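
As a hedged illustration (not the behavior of any specific product), a few common cleansing steps look like this in pandas; the columns and rules are invented:

    # Illustrative cleansing steps; in practice the rules come from the governance
    # plan and data quality guidance rather than being hard-coded.
    import pandas as pd

    df = pd.DataFrame({
        "name": ["  Alice ", "BOB", None, "Carol"],
        "country": ["us", "USA", "U.S.", "DE"],
        "revenue": ["1,200", "950", "n/a", "3000"],
    })

    cleaned = (
        df.assign(
            name=lambda d: d["name"].str.strip().str.title(),
            country=lambda d: d["country"].replace({"us": "US", "USA": "US", "U.S.": "US"}),
            revenue=lambda d: pd.to_numeric(
                d["revenue"].str.replace(",", "", regex=False), errors="coerce"
            ),
        )
        .dropna(subset=["name"])   # name is a critical field: drop rows missing it
        .drop_duplicates()
    )
    print(cleaned)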

4. Clear Data Lineage

With data flowing in from different departments and digital systems, you need a clear understanding of data lineage – how an attribute is transformed as it moves through system-to-system interactions – which in turn builds trust and confidence in the data.
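
Lineage can be captured as simply as recording, for every hop, which upstream fields produced each downstream attribute. A minimal, hypothetical sketch (field and system names are invented):

    # Hypothetical lineage log: trace any derived attribute back to its sources.
    lineage = []

    def record_lineage(target, sources, system, transformation):
        lineage.append({
            "target": target,
            "sources": sources,
            "system": system,
            "transformation": transformation,
        })

    record_lineage("crm.customer_ltv", ["orders.amount", "orders.customer_id"],
                   system="warehouse", transformation="sum of order amounts per customer")
    record_lineage("dashboard.top_customers", ["crm.customer_ltv"],
                   system="BI tool", transformation="rank customers by lifetime value")

    def upstream_of(field):
        return [entry["sources"] for entry in lineage if entry["target"] == field]

    print(upstream_of("dashboard.top_customers"))  # [['crm.customer_ltv']]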

5. Data Catalog and Documentation

Improving data quality is a long-term process that you can streamline using both anticipated issues and past findings. By documenting every detected problem and its associated data quality score in the data catalog, you reduce the risk of repeating mistakes and solidify your data quality enhancement regime over time.
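
One lightweight way to attach a quality score and its history to a catalog entry is shown below; the in-memory dictionary is a hypothetical stand-in for a real catalog tool.

    # Hypothetical catalog entry with an evolving quality score and issue history.
    from datetime import date

    catalog = {"sales.orders": {"owner": "finance", "quality_history": []}}

    def log_quality(dataset, score, issues):
        catalog[dataset]["quality_history"].append({
            "date": date.today().isoformat(),
            "score": score,    # e.g. the share of rows passing all rules
            "issues": issues,
        })

    log_quality("sales.orders", score=0.93, issues=["12 duplicate order_ids"])
    log_quality("sales.orders", score=0.97, issues=["3 missing ship dates"])
    print(catalog["sales.orders"]["quality_history"][-1])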

As stated above, there is simply too much data out there to incorporate into your business intelligence strategy, and data volumes are building up even more with the introduction of new digital systems and the growing spread of the internet. For any organization that wants to keep up with the times, that translates into a need for more personnel, from data curators and data stewards to data scientists and data engineers. Luckily, today’s technology and AI/ML innovation allow even the least tech-savvy individuals to contribute to data management with ease. Organizations should leverage analytics-augmented data quality and data management platforms such as DQLabs.ai to realize immediate ROI instead of long implementation cycles.



from Featured Blog Posts - Data Science Central https://ift.tt/31KFqdG
via Gabe's Musings

Insights from the free State of AI Report

For the last few years, I have read the free State of AI Report.

Here is a list of the insights I found interesting.

The full report and the download link are at the end of this article.

 

AI research is less open than you think: Only 15% of papers publish their code

 

Facebook’s PyTorch is fast outpacing Google’s TensorFlow in research papers, which tends to be a leading indicator of production use down the line

 

PyTorch is also more popular than TensorFlow in paper implementations on GitHub

 

Language models: Welcome to the Billion Parameter club

Huge models, large companies and massive training costs dominate the hottest area of AI today, NLP.

 

Bigger models, datasets and compute budgets clearly drive performance

Empirical scaling laws for neural language models show smooth power-law relationships, which means that to keep improving model performance, model size and the amount of computation have to increase ever more rapidly.
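
For readers who want the shape of the relationship, the scaling laws reported by Kaplan et al. (2020) take roughly this power-law form (constants omitted; quoted here only as an illustration, with L the test loss, N the parameter count, and D the dataset size):

    L(N) ≈ (N_c / N)^α_N        L(D) ≈ (D_c / D)^α_D

The fitted exponents are small (on the order of 0.05 to 0.1), which is why each further drop in loss demands a disproportionately large increase in model size and data.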

 

Tuning billions of model parameters costs millions of dollars

Based on variables released by Google et al., you’re paying circa $1 per 1,000 parameters. This means OpenAI’s 175B parameter GPT-3 could have cost tens of millions to train. Experts suggest the likely budget was $10M.

 

We’re rapidly approaching outrageous computational, economic, and environmental costs to gain incrementally smaller improvements in model performance

Without major new research breakthroughs, dropping the ImageNet error rate from 11.5% to 1% would require over one hundred billion billion dollars! Many practitioners feel that progress in mature areas of ML is stagnant.

 

A larger model needs less data than a smaller peer to achieve the same performance

This has implications for problems where training data samples are expensive to generate, which likely confers an advantage to large companies entering new domains with supervised learning-based models.

 

Even as deep learning consumes more data, it continues to get more efficient

Since 2012 the amount of compute needed to train a neural network to the same performance on ImageNet classification has been decreasing by a factor of 2 every 16 months.
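
Expressed as a rough formula, with t measured in months since 2012 and C_2012 the compute originally needed to reach that ImageNet accuracy, the reported trend amounts to:

    C(t) ≈ C_2012 × 2^(-t / 16)

So, for example, the accuracy that required a given training budget in 2012 needed roughly 1/32 of that compute about 80 months later.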

 

A new generation of transformer language models are unlocking new NLP use-cases

GPT-3, T5, BART are driving a drastic improvement in the performance of transformer models for text-to-text tasks like translation, summarization, text generation, text to code.

 

NLP benchmarks take a beating: Over a dozen teams outrank the human GLUE baseline

It was only 12 months ago that the human GLUE baseline was beaten by 1 point. Now SuperGLUE is in sight.

 

What’s next after SuperGLUE? More challenging NLP benchmarks zero-in on knowledge

A multi-task language understanding challenge tests for world knowledge and problem solving ability across 57 tasks including maths, US history, law and more. GPT-3’s performance is lopsided with large knowledge gaps.

 

The transformer’s ability to generalise is remarkable. It can be thought of as a new layer type that is more powerful than convolutions because it can process sets of inputs and fuse information more globally.

For example, GPT-2 was trained on text but can be fed images in the form of a sequence of pixels to learn how to autocomplete images in an unsupervised manner.

 

Biology is experiencing its “AI moment”: Over 21,000 papers in 2020 alone

Publications involving AI methods (e.g. deep learning, NLP, computer vision, RL) in biology have been growing by more than 50% year-on-year since 2017. Papers published since 2019 account for 25% of all output since 2000.

 

From physical object recognition to “cell painting”: Decoding biology through images

Large labelled datasets offer huge potential for generating new biological knowledge about health and disease.

 

Deep learning on cellular microscopy accelerates biological discovery with drug screens

Embeddings from experimental data illuminate biological relationships and predict COVID-19 drug successes.

 

Ophthalmology advances as the sandbox for deep learning applied to medical imaging

After diagnosis of ‘wet’ age-related macular degeneration (exAMD) in one eye, a computer vision system can predict whether a patient’s second eye will convert from healthy to exAMD within six months. The system uses 3D eye scans and predicted semantic segmentation maps.

 

 

AI-based screening mammography reduces false positives and false negatives in two large, clinically-representative datasets from the US and UK

The AI system, an ensemble of three deep learning models operating on individual lesions, individual breasts and the full case, was trained to produce a cancer risk score between 0 and 1 for the entire mammography case. The system outperformed human radiologists and could generalise to US data when trained on UK data only.

 

Causal reasoning is a vital missing ingredient for applying AI to medical diagnosis

Existing AI approaches to diagnosis are purely associative, identifying diseases that are strongly correlated with a patient’s symptoms. The inability to disentangle correlation from causation can result in suboptimal or dangerous diagnoses.

 

Model explainability is an important area of AI safety: A new approach aims to incorporate causal structure between input features into model explanations

A flaw with Shapley values, one current approach to explainability, is that they assume the model’s input features are uncorrelated. Asymmetric Shapley Values (ASV) are proposed to incorporate this causal information.

 

 

Reinforcement learning helps ensure that molecules you discover in silico can actually be synthesized in the lab. This helps chemists avoid dead ends during drug discovery.

RL agent designs molecules using step-wise transitions defined by chemical reaction templates.

American institutions and corporations continue to dominate NeurIPS 2019 papers

Google, Stanford, CMU, MIT and Microsoft Research own the Top-5.

 

 

The same is true at ICML 2020: American organisations cement their leadership position

The top 20 most prolific organisations by ICML 2020 paper acceptances further cemented their position vs. ICML 2019, as measured by their gains in Publication Index position.

 

Demand outstrips supply for AI talent

Analysis of Indeed.com US data shows almost 3x more job postings than job views for AI-related roles. Job postings grew 12x faster than job viewings from late 2016 to late 2018.

US states continue to legislate autonomous vehicles policies

Over half of all US states have enacted legislation related to autonomous vehicles.

 

 

Even so, driverless cars are still not so driverless: only 3 of the 66 companies with AV testing permits in California have been allowed to test without safety drivers since 2018.

The rise of MLOps (DevOps for ML) signals an industry shift from technology R&D (how to build models) to operations (how to run models)

25% of the top-20 fastest growing GitHub projects in Q2 2020 concern ML infrastructure, tooling and operations. Google Search traffic for “MLOps” is now on an uptick for the first time.

 

 

As AI adoption grows, regulators give developers more to think about

External monitoring is transitioning from a focus on business metrics down to low-level model metrics. This creates challenges for AI application vendors, including slower deployments, IP sharing, and more.

 

Berkshire Grey robotic installations are achieving millions of robotic picks per month

Supply chain operators realise a 70% reduction in direct labour as a result.

 

 

Algorithmic decision making: Regulatory pressure builds

Multiple countries and states start to wrestle with how to regulate the use of ML in decision making.

 

 

GPT-3, like GPT-2, still outputs biased predictions when prompted with topics of religion

The report shows example prompts and completions from GPT-3 and GPT-2 that contain clear bias. Models trained on large volumes of language from the internet will reflect the bias in those datasets unless their developers make efforts to fix this. See the State of AI Report 2019's coverage of how Google adapted their translation model to remove gender bias.

The free download link is available on the State of AI Report site.



from Featured Blog Posts - Data Science Central https://ift.tt/2HGmndJ
via Gabe's Musings