Lesson 21. Working With Data In Numpy

To make sense of data, we frequently organize information in lists and perform numerical operations on them, such as sum, min, max, and average. We can do these operations with built-in Python lists and loops, as discussed in previous lessons, but why take on that overhead when NumPy can do most of them with a single function call? Looping over huge arrays (millions of records) in pure Python is also much slower than using an optimized library such as NumPy. With NumPy, a few lines of code are often enough to complete the analysis we are performing.
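To get a feel for the speed difference, here is a minimal sketch that doubles one million values with a plain loop and then with NumPy. The data and the variable names are made up for illustration, and the timings depend on your machine, so no particular numbers are promised.

```python
import time
import numpy as np

# Hypothetical data: one million integer values from 0 to 99
values = list(range(100)) * 10_000

# Pure Python: loop over every element ourselves
start = time.perf_counter()
doubled_loop = []
for value in values:
    doubled_loop.append(value * 2)
loop_seconds = time.perf_counter() - start

# NumPy: one vectorized operation over the whole array
array = np.array(values)
start = time.perf_counter()
doubled_numpy = array * 2
numpy_seconds = time.perf_counter() - start

print("Loop: ", loop_seconds, "seconds")
print("NumPy:", numpy_seconds, "seconds")
print("Same result:", doubled_loop == doubled_numpy.tolist())
```

Both approaches produce the same values. Only the time taken should differ.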

NumPy also lets us easily access a portion of the data using indexing and perform operations on just that portion. Doing this with built-in Python lists becomes cumbersome, especially when several lists are involved. One example is finding the heights of students who meet some condition, such as having a grade between 50 and 70.
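The heights example could look roughly like the sketch below. The grades and heights are invented for illustration, and it assumes the two arrays list the students in the same order.

```python
import numpy as np

# Hypothetical data: one grade and one height (in cm) per student, same order
grades = np.array([45, 55, 62, 71, 68, 90])
heights = np.array([150, 160, 155, 170, 165, 180])

# Boolean mask: True where the grade is between 50 and 70 (inclusive)
mask = (grades >= 50) & (grades <= 70)

# Use the mask built from grades to pick values out of heights
selected_heights = heights[mask]

print(selected_heights)         # [160 155 165]
print(selected_heights.mean())  # 160.0
```

Note the parentheses around each comparison. They are needed because & binds more tightly than >= and <= in Python.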

Examples

Example #1: MOAR Looping!

Let's say a class has the following grades, where all students failed. The teacher decides to double each student's grade so that some of them pass. Below is how you would have done it using built-in lists and a loop, followed by how you can do the same with NumPy without looping over the data yourself.

import numpy as np

grades = [20,10,30,40,10,20,12,14,15,16,14,12,16]

#old and busted
new_grades = []
for grade in grades:
    new_grades.append(grade*2)
print(new_grades)

#new hotness
grades = np.array(grades)
new_grades = grades*2
print(new_grades)
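One thing to watch out for: the * operator means something different on a plain list than on a NumPy array. The short sketch below, using a made-up three-item list, shows the difference.

```python
import numpy as np

grades_list = [20, 10, 30]
grades_array = np.array(grades_list)

# On a plain list, * repeats the list
print(grades_list * 2)   # [20, 10, 30, 20, 10, 30]

# On a NumPy array, * multiplies every element
print(grades_array * 2)  # [40 20 60]
```

This is why the grades list is converted with np.array before multiplying.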

Example #2: Descriptive Statistics

Finding the min, max, average, and other summary statistics of the grades becomes a lot easier.

# grades is the NumPy array created in Example #1
min_grade = grades.min()
max_grade = grades.max()
avg_grade = grades.mean()  # same as grades.sum() / len(grades)

print("Min grade:", min_grade)
print("Max grade:", max_grade)
print("Avg grade:", avg_grade)
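A couple of other statistics are available in much the same way. The sketch below is self-contained, so it redefines the grades array from Example #1. Choosing the median and standard deviation here is just an illustration, not part of the original lesson.

```python
import numpy as np

grades = np.array([20, 10, 30, 40, 10, 20, 12, 14, 15, 16, 14, 12, 16])

median_grade = np.median(grades)  # middle value once the grades are sorted
std_grade = grades.std()          # population standard deviation
total = grades.sum()              # sum of all grades

print("Median grade:", median_grade)
print("Std deviation:", std_grade)
print("Total:", total)
```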

Example #3: Selecting And Filtering Values

Let's select all students with grades below 25, and then find the max grade among the students meeting that condition.

grades[grades<25]

grades[grades<25].max()
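It can help to see what the expression inside the square brackets produces on its own. The sketch below breaks the filter into steps. It redefines the grades array so it can run by itself, and the names mask, count_below, and low_grades are chosen only for illustration.

```python
import numpy as np

grades = np.array([20, 10, 30, 40, 10, 20, 12, 14, 15, 16, 14, 12, 16])

# The comparison returns an array of True/False values, one per grade
mask = grades < 25
print(mask)

# True counts as 1, so summing the mask counts the matching students
count_below = mask.sum()
print(count_below)       # 11

# Indexing with the mask keeps only the grades where the mask is True
low_grades = grades[mask]
print(low_grades.max())  # 20
```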

Now you try it!

Don't copy and paste. Type the code yourself!

