Why Not Data Science?

Data Science this, Data Science that. What is it really about?

ds1.png IMG SOURCE: saarland-informatics-campus.de/en/studium-s..

At the time of writing (2022), data science has become an essential part of nearly every sector, especially business and academic research. Applications of data science include machine translation, robotics, speech recognition, the digital economy, and search engines. The amount of available data has increased tremendously, both because more people are using the internet and because industries accumulate massive volumes of data in their inventories and databases. As the demand for strategic, data-driven decision-making grows across industries, the demand for data scientists also increases. As of 2022, the average pay for data scientists is $141,362 per annum in the US. In addition, data science employment is expected to grow by 22% by 2030, which is four times faster than the average profession, according to the US Bureau of Labor Statistics (2021).

Brief History of Data Science

DS2.jpg IMG SOURCE: plopdo.com/2021/10/02/evolution-of-data-sci..

The term "data science" is reported to have been coined in 1974 by Peter Naur as a suitable alternative to computer science. Data science began with statistics and has evolved to include concepts from computer science. It overlaps with artificial intelligence and combines data methods, scientific analysis, and statistics to extract insights and meaning from data.

Who is a Data Scientist?

Ds3.png IMG SOURCE: kdnuggets.com/2022/01/deep-look-13-data-sci..

According to Wikipedia, data are facts and statistics collected together for reference or analysis. A data scientist extracts valuable knowledge from big data using algorithms that discover patterns in the data. I like to describe data scientists as interdisciplinary because they deal with all aspects of a problem, from data collection and preprocessing to model building and deployment.

What Do Data Scientists Do?

ds11.jpg IMG SOURCE: forbes.com/sites/sophiamatveeva/2019/11/29/..

The tasks of a data scientist may involve getting data ready for analysis, exploring and visualizing data, creating models, and integrating these models into applications. Some organizations require additional skills, so I advise that you check the job description for each data science opportunity you are interested in.

  1. Data Extraction: Data is collected from different sources, such as websites (using web scraping) or databases (such as MySQL, IBM DB2, and the like). The data is then transformed by changing its format, column names, records, and file structures, which makes it suitable for a data scientist to work with. Data science can be a good career option if you have a background in ETL (Extract, Transform, and Load).

  2. Data Wrangling: Most collected data is noisy, i.e. dirty and messy. Data cleaning, manipulation, and organization are all part of data wrangling. Python and R are popular tools for this phase, while Apache Flume is used to collect and move large amounts of incoming data.

  3. Data Exploration: Knowing the data you are working with and the problem you want to solve helps you choose the best approach to get the answer you need. To do this, use charts, graphs, or diagrams to show trends, outliers, unexpected results, correlations, and patterns. Python libraries such as pandas or other tools such as Power BI are used to explore the data.

  4. Machine Learning: Machine learning is the science of building systems that learn from data in order to analyze it and make predictions or decisions. Building accurate machine learning models increases the likelihood of your organization finding profitable opportunities and avoiding unexpected threats. Models are built using regression, decision tree, and clustering algorithms, amongst others. The goal is a model that generalizes well, meaning it makes accurate predictions on data it has not seen before.

  5. Big Data Processing: Big data refers to datasets so large that regular data processing tools cannot handle them. Training a deep learning model requires a massive amount of data, and creating accurate deep learning models was previously impractical due to a lack of data and computing capacity. Today, large amounts of data are being generated rapidly. This data may be structured or unstructured. Apache Spark and Hadoop are used to process big data, and tools such as Tableau help to visualize the results.

  6. Deployment: Building a model is usually not the end of a project. The model has to be deployed to make it available to the users. The model can be integrated into applications or deployed to the cloud for accessibility.
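
To make the steps above more concrete, here is a minimal Python sketch of steps 1 to 4 and 6 using only the standard library (big data processing is left out). Everything in it is an assumption made for illustration: the sample data, the column names, and the function names (extract, wrangle, explore, fit_line, make_predictor) are hypothetical. A real project would more likely read from a database or a website and use libraries such as pandas and scikit-learn.

```python
import csv
import io
import statistics

# Hypothetical raw data. In practice this would come from a database or a scraped page.
RAW_CSV = """Ad Spend ,Sales
10,25
20,45
,50
30,65
40,85
40,85
"""


def extract(raw_text):
    """Step 1 (extraction): parse CSV text into a list of dict records."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def wrangle(records):
    """Step 2 (wrangling): tidy column names, drop incomplete and duplicate rows."""
    cleaned, seen = [], set()
    for record in records:
        row = {key.strip().lower().replace(" ", "_"): value.strip()
               for key, value in record.items()}
        if not all(row.values()):
            continue  # skip rows with a missing value
        pair = (float(row["ad_spend"]), float(row["sales"]))
        if pair not in seen:  # skip exact duplicates
            seen.add(pair)
            cleaned.append(pair)
    return cleaned


def explore(pairs):
    """Step 3 (exploration): basic summary statistics instead of a chart."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    return {
        "rows": len(pairs),
        "mean_ad_spend": statistics.mean(xs),
        "mean_sales": statistics.mean(ys),
    }


def fit_line(pairs):
    """Step 4 (machine learning): least-squares fit of sales = slope * ad_spend + intercept."""
    mean_x = statistics.mean([x for x, _ in pairs])
    mean_y = statistics.mean([y for _, y in pairs])
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    variance = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def make_predictor(slope, intercept):
    """Step 6 (deployment): wrap the fitted model in a function an application can call."""
    def predict(ad_spend):
        return slope * ad_spend + intercept
    return predict


if __name__ == "__main__":
    pairs = wrangle(extract(RAW_CSV))
    print(explore(pairs))
    predict = make_predictor(*fit_line(pairs))
    print(predict(50))
```

The sketch keeps each step in its own small function so you can see where one phase ends and the next begins. In a real deployment, the predict function would typically sit behind a web API or inside an application rather than being called from a script.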

What Skills Should You Learn to Become a Data Scientist?

ds7.png IMG SOURCE: geeksforgeeks.org/top-10-data-science-skill..

To become a data scientist, you need an excellent background in math and statistics. Other necessary skills are:

  1. Proficiency in programming languages such as Python, R, or SAS
  2. Data extraction, transformation, and loading
  3. Data analysis and visualization
  4. Machine learning
  5. Big data processing
  6. Model deployment

While you are upskilling, do not forget to build simple projects, because hands-on practice is the best way to truly understand how the code and tools work. A wise man once said, 'Practice makes perfect.'

Pros and Cons of Data Science

ds10.png IMG SOURCE: cooltool.medium.com/pros-and-cons-of-neurom..

Pros

  1. It is intriguing because it is constantly changing and developing. The subject is vast and offers enthusiasts so much to explore and consider.

  2. You can use data science knowledge to build innovative projects that improve whichever sector you want to specialize in.

  3. The field of data science offers countless employment prospects, and the pay you can earn as a professional data scientist is attractive.

  4. Most data science components, such as data storage, programming, AI, computer vision, and natural language processing, will continue to be in high demand over the next few years.

Cons

  1. Data scientists may waste time and energy developing unnecessarily complex algorithms when simple algorithms would work just as well or better.

  2. Some data science concepts require more work and attention than others. Understanding the math details and the underlying code for all topics can be time-consuming and exhausting, especially if you have a busy schedule.

  3. When working on a project, you must understand the subject matter, even though you might not have any prior background in it. You have to do this research while still completing the assigned work within the given time.

The Future of Data Science

d13.png IMG SOURCE: mediawire.in/blog/industry/data-science-job..

Some data science processes are repetitive and are therefore automated to save time. Automation technologies like AutoML are not meant to replace data scientists in the future. Their benefits include freeing data scientists from repetitive tasks and reducing human error, among others. Data science is being applied in a wide range of sectors, and significant future advances include fully automated public supermarkets, accurate investment prediction, self-driving cars, and e-health, amongst others.

Hehe, disclaimer: these images are not mine, I got them online <3. Huge credit to the owners, whose links I shared below each image. The cover image source is btelligent.com/en/portfolio/data-science. This is my first blog post, so pardon the borrowed images for now. I hope to get a technical writing job with an organization that has a great graphic designer who can deliver great content. Until then...

I wrote this blog post for people who are interested in or curious about data science but have zero idea about it. I hope you now have a clear understanding of the field. I am looking forward to your feedback. Please drop a comment. Till next time!