Tutorials
Hands-On Python
Hands-On Python
  • Hands-On Python Tutorial For Real-World Business Analytics Problems
  • Preface
    • Section I. A Note From The Author
    • Section II. Tutorial Overview
    • Section III. What Is The Preflight Checklist?
    • Section IV. Supplimentery Material
  • Preflight Checklist
    • Section V. Select Your Difficulty Setting
    • Section VI. Download Anaconda
    • Section VII. Download PyCharm (Optional)
    • Section VIII. Download SQL Server Developer Edition
    • Section IX. Configure Database Environment
    • Section X. Download The Source Code
    • Section XI. Starting JupyterLab
    • Section XII. How To Get Help With This Tutorial
  • Language Basics
    • Lesson 1. Obligatory Hello World
    • Lesson 2. Code Comments
    • Lesson 3. Data Types
    • Lesson 4. Variables
    • Lesson 5. String Concatenation
    • Lesson 6. Arithmetic Operators
    • Lesson 7. Making Decisions
    • Lesson 8. Control Flow With if-elif-else
    • Lesson 9. Control Flow With while
    • Lesson 10. Data Structures Part I: List
    • Lesson 11. Data Structures Part II: Tuples
    • Lesson 12. Data Structures Part III: Dictionaries
    • Lesson 13. Looping With for
    • Lesson 14. Functions
    • Lesson 15. Importing Modules
    • Lesson 16. Python Programming Standards
  • Advanced Topics
    • Lesson 17. Functional Programing With map
    • Lesson 18. Generators
    • Lesson 19. Comprehensions
    • Lesson 20. Basic File Operations
    • Lesson 21. Working With Data In Numpy
    • Lesson 22. Working With Data In Pandas
    • Lesson 23. Working With JSON
    • Lesson 24. Making File Request Over HTTP And SFTP
    • Lesson 25. Interacting With Databases
    • Lesson 26. Saving Objects With Pickle
    • Lesson 27. Error Handling
    • Lesson 28. Bringing It All Together
  • Solutions To Real World Problems
    • Lesson 29. Download A Zip File Over HTTP
    • Lesson 30. Looping Over Files In A Directory
    • Lesson 31. Convert Comma Delmited Files To Pipe Delimited
    • Lesson 32. Combining Multiple CSVs Into One File
    • Lesson 33. Load Large CSVs Into Data Warehouse Staging Tables
    • Lesson 34. Efficiently Write Large Database Query Results To Disk
    • Lesson 35. Working With SFTP In The Real World
    • Lesson 36. Executing Python From SQL Server Agent
Powered by GitBook
On this page
  1. Solutions To Real World Problems

Lesson 31. Convert Comma Delmited Files To Pipe Delimited

This is a simple example and is really just here to build on concepts.

When you are dealing with flat files, you are going to run into all kinds of issues. One of the issues you are going to deal with is having commas in data.

You might have to work with a file that has text fields containing user input. The problem will be that the file has not been prepared in such a way that software interprets these commas as text and not column delimiters. When this happens, you are going to have to do some clean up work to the file before you can load it into a database, otherwise those commas are going to cause problems on import.

Once you have cleaned up the file or even if you do not have to do any cleanup work at all, it is in your best interest to take the file and convert it to pipe delimited instead of comma delimited.

Pipe is the name for that thing over the backlash on your keyboard. Pipe is RARELY used in everyday typing, so there is a low probably that pipe will show up in text fields.

It does not matter what the source is. If you have to write a delimited flat file, write it out as pipe delimited.

Examples

Example #1: Convert File To Pipe Delimited

This example downloads the comma delimited flights dataset.

For simplicity, we will not write the comma delimited version of the file as it is a relatively small file and fits in memory. Instead, we will read it into memory and then write the pipe delimited version to the PipeDelimitedExample folder.

In order to write a file as pipe delimited with the to_csv function, you use the sep argument and set it to |.

import urllib.request
import os
import pandas as pd

url = 'https://query.data.world/s/paxejv4t6pn3el4ttskmx7rhxhz5ho'
script_dir = os.getcwd()
data_directory = 'data\\'
example_directory = 'PipeDelimitedExample\\'
file_name = 'flights.csv'
file_path = os.path.join(script_dir,data_directory,example_directory,file_name)

# Read csv data into pandas and write | delimited txt file
df = pd.read_csv(url,index_col=False, dtype=str)
df.to_csv(file_path, sep="|",index=False)

PreviousLesson 30. Looping Over Files In A DirectoryNextLesson 32. Combining Multiple CSVs Into One File

Last updated 3 years ago