Data Analysis, Introduction | DBDA.X404
Data analysis is the process of transforming data into useful information to support decision making. It is the foundation for data mining, business intelligence, and predictive analytics. This course presents the tools, techniques and common practices used in the industry, including how to obtain, manipulate, explore, model, simulate and present data. It will help you build the essential technical skills to work as a data analyst or data scientist, and to continue with other courses in the certificate program.
The course examines different approaches to a data analysis project, with a framework for organizing an analytical effort. Popular tools such as R and Python can be used to carry out the analysis, but class instruction and examples primarily use R. The course covers how to obtain and manipulate raw data, as well as basic exploratory analysis and common analytical techniques such as regression, simulation, estimation and forecasting. It includes several graphing and visualization tools for understanding the data and presenting findings and results.
By the end of the course, you will have a working framework for approaching any data analysis project. You will be able to use R (or Python) to complete a large data analysis project, including a write-up with findings, insights and visuals. All tools used are open source.
At the conclusion of the course, you should be able to:
- Describe a framework for approaching a data analysis project
- Discuss the importance of data analysis for data science, data visualization and exploration
- Explain the basic concepts of R and how to use R for data analysis
- Identify the tools, concepts and functions required for data analysis
Topics include:
- Approaches to data analysis: Templates, write-ups and illustrative examples
- Overview of tools for data analysis: R, RStudio (IDE) and comparison with Python
- Obtaining data: Finding data sets, web scraping, file formats
- Data manipulation techniques: Data quality, reshaping data, appending and joining data sets
- Plotting and visualization: Exploration and presentation
- Exploratory data analysis: Visual inspection, descriptive analytics, insights
- Regression models: Simple, multiple and logistic
- Analysis report write-up and presentation, including graphs
- Simulation techniques: Fitting distributions, simulating stochastic processes
- Forecasting methods and applications: Smoothing, moving averages, time series, ARIMA
Skills Needed: Some programming experience is recommended. (R will be covered in class and used in examples. Python experience can be helpful.) Basic knowledge of probability and statistics is required, at the level of a basic statistics textbook (for example, www.stattrek.com).
- Save your seat and help us confirm course scheduling. Enroll at least seven days before your course starts.
- Accessing Canvas: Learn more about accessing your course on Canvas in our FAQ section.
Sections Open for Enrollment:
|Date:||Start Time:||End Time:||Meeting Type:||Location:|
|Sat, 01-08-2022||9:00 a.m.||12:00 p.m.||Flexible||SANTA CLARA / REMOTE|
|Sat, 01-15-2022||9:00 a.m.||12:00 p.m.||Flexible||SANTA CLARA / REMOTE|
|Sat, 01-22-2022||9:00 a.m.||12:00 p.m.||Flexible||SANTA CLARA / REMOTE|
|Sat, 01-29-2022||9:00 a.m.||12:00 p.m.||Flexible||SANTA CLARA / REMOTE|
|Sat, 02-05-2022||9:00 a.m.||12:00 p.m.||Flexible||SANTA CLARA / REMOTE|
|Sat, 02-12-2022||9:00 a.m.||12:00 p.m.||Flexible||SANTA CLARA / REMOTE|
|Sat, 02-19-2022||9:00 a.m.||12:00 p.m.||Flexible||SANTA CLARA / REMOTE|
|Sat, 02-26-2022||9:00 a.m.||12:00 p.m.||Flexible||SANTA CLARA / REMOTE|
|Sat, 03-05-2022||9:00 a.m.||12:00 p.m.||Flexible||SANTA CLARA / REMOTE|
|Sat, 03-12-2022||9:00 a.m.||12:00 p.m.||Flexible||SANTA CLARA / REMOTE|