
Senior Consultant
Atos Origin
In the AI field, before speaking about artificial intelligence itself, we must talk about data: the data sets, the data science behind them, and the data we use to teach machines, which must be robust. Generally, the more data there is, the more robust the model can be, but there is a catch: if a machine over-learns from a huge amount of data, that creates a problem, and if it under-learns, that is a problem too.
The data should be specific enough that the machine picks up and learns exactly what it needs to know. Only then can we be sure the machine is learning properly, so that when new data arrives it can recognize it and give an accurate output. It is largely this over- or under-learning that decides whether the accuracy level lands around 90%, which is good enough, or around 50%, which is certainly not an acceptable value.
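As a rough illustration of what over- and under-learning look like in practice, the sketch below is a toy example of my own (scikit-learn on synthetic data, not taken from the article): it compares training and validation accuracy for models that under-learn, fit reasonably, and over-learn.

```python
# Toy illustration of under-learning (underfitting) vs over-learning (overfitting).
# The data set, model, and depth values are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic "structured" data set: 1,000 rows, 20 feature columns.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3,
                                                  random_state=42)

for name, depth in [("under-learning", 1), ("balanced", 5), ("over-learning", None)]:
    model = DecisionTreeClassifier(max_depth=depth, random_state=42)
    model.fit(X_train, y_train)
    print(f"{name:>14}: train={model.score(X_train, y_train):.2f} "
          f"val={model.score(X_val, y_val):.2f}")

# An over-learned tree scores near 1.00 on training data but noticeably lower
# on validation data; an under-learned tree scores poorly on both.
```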
Once the data is in good shape, we can tackle the use cases and the problem we want to solve. There are two broad types of data: structured and unstructured. In classical machine learning, to reach 90+% accuracy it is advisable to work with structured data sets, while unstructured or image-related data sets are better suited to deep learning, which can reach 90+% accuracy on them.
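To make that distinction concrete, here is a small sketch of my own (the column names and array shape are made-up assumptions) showing what structured and unstructured data typically look like in code: a table of named columns versus a raw pixel array.

```python
# Illustrative only: column names and image shape are assumptions, not article data.
import numpy as np
import pandas as pd

# Structured data: named columns with one row per record -- a natural fit
# for classical machine learning models (trees, linear models, boosting).
structured = pd.DataFrame({
    "age":     [34, 51, 29],
    "salary":  [52000, 87000, 43000],
    "churned": [0, 1, 0],
})

# Unstructured data: a raw 64x64 RGB image as a pixel array -- no named
# columns, so deep learning models such as CNNs learn the features themselves.
unstructured = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)

print(structured.dtypes)    # each column has a clear meaning and type
print(unstructured.shape)   # just pixels: (height, width, channels)
```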
Deep learning functions much like our brain does when it tries to recognize something or solve a particular problem. As soon as a problem arrives, we filter out what matters, apply logical thinking and assumptions, and, based on everything we know, draw a conclusion. For example, if we are shown a photograph of a dog, we recognize it as a dog because we have seen dogs since childhood; but if we are shown an animal we have never seen, such as a fox, we might conclude that it is a wolf.
To build such a large body of data accurately, we need strong data scientists to collect the right data for the problem. From the moment the problem is defined, the data scientist should be involved in producing a suitable data set and formulating how the machine will learn from it. This is one of the most critical steps, and if it is done well, applying deep learning concepts to that data later becomes a cakewalk.
How can the data scientist do that? First, gather a large amount of data and inspect its columns and rows: are the required columns present, and are there enough rows to work with? Then, by applying proper filters, the data scientist can confirm that all the categories needed are actually there. Sorting the data also helps in understanding it. Data analytics tools can help finalize the data, but a hands-on check of the raw data, even in an Excel sheet, is still one of the best ways to verify what has been assembled. Finally, data visualization can reveal gaps, so that any missing data can be added to make the set more robust. A data scientist has to do all of this.
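Several of these checks can be scripted. The sketch below is a minimal example of mine, assuming a hypothetical customers.csv file and column names that are not from the article; it shows how a data scientist might inspect columns, row counts, categories, and missing values with pandas before any modelling.

```python
# Hypothetical data-quality check; 'customers.csv' and its columns are assumptions.
import pandas as pd

df = pd.read_csv("customers.csv")

# 1. Are the required columns there, and are there enough rows?
required = {"age", "salary", "category"}
missing_cols = required - set(df.columns)
print("missing columns:", missing_cols or "none")
print("row count:", len(df))

# 2. Filter / count to confirm every expected category is present.
print(df["category"].value_counts())

# 3. Look for gaps that visualization or imputation will have to address.
print(df.isna().sum().sort_values(ascending=False).head())

# 4. Sorting can surface outliers and help understand the data.
print(df.sort_values("salary").head())
```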
Deep learning fundamentals work in a similar way. A convolutional neural network (CNN) is a type of neural network built from layers of filters. When we send a smiling person's photo through a CNN trained on a huge amount of data, the first layer of filters might distinguish whether the subject is a human or an animal and decide that it is a human; the second might distinguish whether it is a man or a woman; the third whether the person is smiling, laughing, or crying; and the fourth narrows it down to the smiling man, so that the network finally gives a single output: the smiling face.
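As a rough sketch of that layered filtering idea, here is a minimal four-layer CNN in PyTorch. The input size, channel counts, and the four class labels are my own illustrative assumptions; in practice the layers learn low-level edge and texture filters rather than clean human-readable concepts, but the layer-by-layer structure is the same.

```python
# Minimal illustrative CNN; input size, channel counts, and the four class
# labels are assumptions for this sketch, not values from the article.
import torch
import torch.nn as nn

class FourLayerCNN(nn.Module):
    def __init__(self, num_classes: int = 4):
        super().__init__()
        # Four stacked "filter" layers: convolution + ReLU + downsampling each.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Final classifier gives exactly one output class per image.
        self.classifier = nn.Linear(128 * 4 * 4, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)            # 3x64x64 -> 128x4x4 after four poolings
        return self.classifier(x.flatten(1))

# One 64x64 RGB photo -> scores for four hypothetical classes, e.g.
# ["smiling man", "smiling woman", "neutral man", "neutral woman"].
logits = FourLayerCNN()(torch.randn(1, 3, 64, 64))
print(logits.shape)  # torch.Size([1, 4])
```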
So it is up to the deep learning developer to decide how many filter layers to add before the output; he can experiment and decide, though around four filtering layers is a common starting point. Once the developer is satisfied and reaches about 90% accuracy, that is acceptable. If not, backward propagation helps the network identify where it is going wrong, so that it can adjust its weights, learn from its mistakes, and produce more accurate values.
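Backward propagation in that sense is what a standard training loop does: compute the loss, propagate the error backwards through the layers, and adjust the weights. The sketch below is illustrative PyTorch on random stand-in data, with a tiny made-up model rather than the article's use case, just to show those steps.

```python
# Illustrative training loop on random data; the tiny model, batch size, and
# learning rate are assumptions chosen only to show the backpropagation steps.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 64), nn.ReLU(),
                      nn.Linear(64, 4))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

images = torch.randn(8, 3, 64, 64)           # stand-in batch of photos
labels = torch.randint(0, 4, (8,))           # stand-in class labels

for epoch in range(5):
    optimizer.zero_grad()                    # clear old gradients
    loss = criterion(model(images), labels)  # how wrong were the predictions?
    loss.backward()                          # backward propagation of the error
    optimizer.step()                         # adjust weights to reduce the mistake
    print(f"epoch {epoch}: loss={loss.item():.3f}")
```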
Once the above is possible, we can look at a practical use case. For example, with cameras installed outside and inside a lift, we can observe people's moods as they come to the office: whether they are in a good, smiling mood or a low, disturbed one; whether they wait patiently or impatiently for the lift; whether they enter it quickly or slowly when it arrives; and how their mood looks inside the lift. All of this can help analyze a person's behavior, which in turn can help psychiatrists and psychologists assess that person's moods more accurately.
This is the interplay of machine learning, deep learning, Natural Language Processing, and other exciting concepts through which machines can provide 90 to 99% accurate results. We should recognize the worth of AI now, because even if we still think AI is worthless and nothing will come of it, this decade will show us just how much AI advances and how deeply it can enter our day-to-day lives. But let us not let AI behave wrongly and become our enemy; instead, we should make it work with us happily and make AI our friend.
– Kanu Butani is the Senior Consultant at Atos Origin