What is Data Science?

Definition of data science
Data science combines several fields, including statistics, scientific methods, artificial intelligence (AI), and data analytics, to extract value from data. The people who practice it are called data scientists, and they combine a range of skills to turn data collected from the web, smartphones, clients, sensors, and other sources into actionable insights.

Data science involves preparing data for analysis: cleaning, aggregating, and processing it so that advanced analysis can be performed. Analytics applications and data scientists can then review the results to uncover patterns and help business leaders draw informed conclusions.
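As a minimal sketch of what the preparation step can look like, the Python snippet below cleans and aggregates a handful of made-up sensor readings using only the standard library. The record layout, the field names, and the values are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw readings: some values are missing or malformed.
raw_readings = [
    {"sensor": "a", "value": "21.5"},
    {"sensor": "a", "value": ""},      # missing value
    {"sensor": "b", "value": "19.0"},
    {"sensor": "a", "value": "22.5"},
    {"sensor": "b", "value": "n/a"},   # malformed value
    {"sensor": "b", "value": "20.0"},
]


def clean(readings):
    """Keep only the rows whose value parses as a float."""
    cleaned = []
    for row in readings:
        try:
            cleaned.append({"sensor": row["sensor"], "value": float(row["value"])})
        except ValueError:
            continue  # drop rows that cannot be parsed
    return cleaned


def aggregate(rows):
    """Average the cleaned values per sensor."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["sensor"]].append(row["value"])
    return {sensor: mean(values) for sensor, values in grouped.items()}


cleaned_rows = clean(raw_readings)
averages = aggregate(cleaned_rows)
print(averages)
```

In a real project this work is usually done with a dedicated data library and on far larger datasets, but the two stages (drop or repair bad records, then summarize) are the same.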

Data Science: An Untapped Resource for Machine Learning
Data science is one of the most exciting fields today. But why is it so important?

Because companies are sitting on a treasure trove of data. Modern technology has made it possible to create and store ever larger amounts of information, and data volumes have grown dramatically as a result. By one estimate, 90 percent of the world's data was created in the last two years. Facebook users, for example, upload 10 million photos every hour.

But that data often just sits in databases and data lakes, mostly untouched.

The wealth of data collected and stored by these technologies can bring transformational benefits to organizations and societies around the world, but only if we can interpret it. This is where data science comes in.

Data science uncovers trends and provides insights that businesses can use to make better decisions and create more innovative products and services. Perhaps most importantly, it allows machine learning (ML) models to learn from the vast amounts of data fed to them, rather than relying primarily on business analysts to see what they can discover in the data.

Data is the foundation of innovation, but its value comes from the information that data scientists can extract from it and then act on.

What is the difference between data science, artificial intelligence and machine learning?
To better understand data science and how you can use it, it also helps to know other terms related to the field, such as artificial intelligence (AI) and machine learning. You will often find these terms used interchangeably, but there are nuances.

Here's a simple breakdown:

  • AI means making a computer mimic human behavior in some way.
  • Data science overlaps with AI and refers more to the areas of statistics, scientific methods, and data analysis that are used to extract meaning and understanding from data.
  • Machine learning is a subset of AI made up of techniques that allow computers to learn from data, and it is used to build AI applications.
  • Deep learning, to give one more definition, is a subset of machine learning that uses multi-layered neural networks to solve more complex problems.

How data science is done
The process of analyzing and acting on data is iterative rather than linear, but the data science lifecycle typically goes like this for a data modeling project:

Planning: Define the project and its potential outcomes.

Data model building: Data scientists often use a variety of open source libraries or in-database tools to build machine learning models. Users often want APIs to help with data ingestion, data profiling and visualization, or feature engineering. They need the right tools, as well as access to the right data and to other resources such as computing power.
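To make "building a model" a little more concrete, here is a minimal sketch that fits a straight line to a few data points with ordinary least squares, written in plain Python. The spend and sales figures are invented for illustration, and in practice a data scientist would normally reach for one of the open source libraries mentioned above rather than code the formula by hand.

```python
from statistics import mean

# Hypothetical training data: advertising spend (x) and resulting sales (y).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]


def fit_line(xs, ys):
    """Ordinary least squares for one input variable: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return slope, intercept


slope, intercept = fit_line(spend, sales)


def predict(x):
    """Use the fitted line to estimate sales for a new spend value."""
    return slope * x + intercept


print(slope, intercept, predict(6.0))
```

The "model" here is just the two fitted numbers. Real models have many more parameters, but the pattern of learning parameters from data and then calling a predict function is the same.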

Model evaluation: Data scientists need their models to reach a high level of accuracy before they can feel confident deploying them. Model evaluation typically generates a comprehensive set of scorecards and visualizations that measure each model's performance against new data and rank models over time to ensure optimal behavior in production. Evaluation goes beyond raw performance and also takes the expected baseline behavior into account.
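The idea of judging a model against a baseline can be sketched in a few lines of Python. The labels and predictions below are made up, and the baseline used here (always predicting the most common class) is just one simple choice among many.

```python
from collections import Counter

# Hypothetical held-out labels and model predictions (1 = fraud, 0 = legitimate).
actual = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
predicted = [0, 0, 1, 0, 0, 0, 0, 1, 1, 0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline_accuracy(y_true):
    """Accuracy of a trivial model that always predicts the most common label."""
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)


model_accuracy = accuracy(actual, predicted)
baseline_accuracy = majority_baseline_accuracy(actual)
print(f"model: {model_accuracy:.2f}, baseline: {baseline_accuracy:.2f}")
```

With this toy data the model scores 0.80 against a baseline of 0.70, which shows why a raw accuracy figure says little on its own: a model that never flags fraud at all would already be right 70 percent of the time.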

Model explanation: Explaining the inner mechanics behind the results of machine learning models in human terms has not always been possible, but it is becoming increasingly important. Data scientists need automated explanations of the relative weight and importance of the factors that drive a prediction, as well as explanatory details for individual predictions.
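One common way to estimate the relative importance of input factors is permutation importance: shuffle one input column and see how much the model's accuracy drops. The sketch below is a simplified illustration of that idea, not a description of any particular tool. The transaction data, the stand-in model, and its 500 threshold are all hypothetical.

```python
import random

# Hypothetical rows: (transaction_amount, hour_of_day), with label 1 = flagged.
rows = [(950, 3), (20, 14), (700, 9), (15, 22), (880, 17), (40, 1), (610, 12), (30, 8)]
labels = [1, 0, 1, 0, 1, 0, 1, 0]


def model(row):
    """Stand-in model: flags transactions above a fixed amount and ignores the hour."""
    amount, _hour = row
    return 1 if amount > 500 else 0


def accuracy(data, targets):
    return sum(model(r) == t for r, t in zip(data, targets)) / len(targets)


def permutation_importance(data, targets, column, repeats=20, seed=0):
    """Average drop in accuracy when one column is shuffled across rows."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    baseline = accuracy(data, targets)
    drops = []
    for _ in range(repeats):
        shuffled = [r[column] for r in data]
        rng.shuffle(shuffled)
        permuted = [r[:column] + (v,) + r[column + 1:] for r, v in zip(data, shuffled)]
        drops.append(baseline - accuracy(permuted, targets))
    return sum(drops) / repeats


amount_importance = permutation_importance(rows, labels, column=0)
hour_importance = permutation_importance(rows, labels, column=1)
print(f"amount: {amount_importance:.2f}, hour: {hour_importance:.2f}")
```

Because the stand-in model never looks at the hour, shuffling that column changes nothing and its importance comes out as zero, while shuffling the amount hurts accuracy noticeably.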

Model deployment: Taking a trained machine learning model and getting it into the right systems is often a difficult and time-consuming process. It can be simplified by operationalizing models as scalable and secure APIs, or by using in-database machine learning models.
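At its core, exposing a model as an API means accepting a request, running the model, and returning the result in a standard format such as JSON. The sketch below shows only that core step as a plain function. The model, the `amount` field, and the `handle_predict` name are hypothetical, and a real deployment would put this behind a web server with authentication, logging, and scaling.

```python
import json


def model(features):
    """Hypothetical stand-in for a trained model: scores by transaction amount."""
    return 1 if features["amount"] > 500 else 0


def handle_predict(request_body):
    """Turn a JSON request string into a (status_code, JSON response string) pair."""
    try:
        features = json.loads(request_body)
        prediction = model(features)
    except (json.JSONDecodeError, KeyError, TypeError):
        # Malformed JSON, a missing field, or a value of the wrong type.
        return 400, json.dumps({"error": "expected a JSON object with a numeric 'amount' field"})
    return 200, json.dumps({"prediction": prediction})


status, body = handle_predict('{"amount": 750}')
print(status, body)
```

Keeping the request handling separate from the model itself makes it easier to swap in a retrained model later without changing the systems that call the API.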

Model monitoring: Unfortunately, deploying a model is not the end of the story. Models should always be monitored after deployment to make sure they keep working correctly. The data a model was trained on may stop being relevant for future predictions after some time has passed. In fraud detection, for example, criminals keep coming up with new ways to break into accounts.
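A very simple form of monitoring is to check whether the data arriving in production still looks like the data the model was trained on. The sketch below measures how far the average of a feature has moved, in units of the training standard deviation. The numbers are invented, and the threshold of 3.0 is an arbitrary assumption rather than a recommended setting.

```python
from statistics import mean, stdev

# Hypothetical feature values seen at training time and later in production.
training_amounts = [20, 35, 50, 40, 30, 45, 25, 55]
live_amounts = [120, 140, 95, 160, 130, 110]


def drift_score(reference, current):
    """Shift of the current mean, measured in reference standard deviations."""
    return abs(mean(current) - mean(reference)) / stdev(reference)


def needs_review(reference, current, threshold=3.0):
    """Flag the model for review when the feature has drifted past the threshold."""
    return drift_score(reference, current) > threshold


score = drift_score(training_amounts, live_amounts)
print(f"drift score: {score:.2f}, needs review: {needs_review(training_amounts, live_amounts)}")
```

Production monitoring usually tracks many features and the model's accuracy over time, but even a basic check like this can signal that it is time to retrain.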

For a data science consultation, please contact https://data-science-ua.com/

Hello, @coddyg

This is @fionasfavourites from the @ocd (Original Content Decentralized) curation team. We noticed you shared your first post here on Hive - congratulations and welcome! It would also be awesome if you could do an introduction post, so our community can get to know you better. For an example of what an intro post is like, you can check out this one by my friend & curation team member - Keeping Up With the Buzz – My Introduction to the Hive Community.

Speaking of community, we have many different ones here on the blockchain, devoted to all kinds of interests. Here's a link so you can check them all out – Hive Communities.

Also, since you're new, you may run into an RC (Resource Credits) error when trying to comment or post because you don't yet have enough Hive in your account. For assistance with a temporary delegation to get you started, be sure to check out the Gift Giver site.

Also, as Hive can be quite confusing, the newly launched Newbies Guide is a growing repository of useful, easy-to-understand posts about how the Hive ecosystem works.

For now, @lovesniper will follow your account and we are looking forward to seeing your intro post. Also, you are welcome to tag me (@fionasfavourites) and please mention @lovesniper in your intro post in order for us to be notified, so we can consider your post for OCD curation. Feel free to hop into the OCD Discord server if you have any questions!