How Do You Read a Text File in Python

Reading and Writing Text Files

Overview

Teaching: 60 min
Exercises: 30 min

Questions

  • How can I read in data that is stored in a file or write data out to a file?

Objectives

  • Be able to open a file and read in the data stored in that file

  • Understand the difference between the file name, the opened file object, and the data read in from the file

  • Be able to write output to a text file with simple formatting

Why do we want to read and write files?

Being able to open and read in files allows us to work with larger data sets, where it wouldn't be possible to type in each and every value and store them one at a time as variables. Writing files allows us to process our data and then save the output to a file so we can look at it later.

Right now, we will practice working with a comma-delimited text file (.csv) that contains several columns of data. However, what you learn in this lesson can be applied to any general text file. In the next lesson, you will learn another way to read and process .csv data.

Paths to files

In order to open a file, we need to tell Python exactly where the file is located, relative to where Python is currently working (the working directory). In Spyder, we can do this by setting our current working directory to the folder where the file is located. Or, when we provide the file name, we can give a complete path to the file.
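
To see how the working directory and a complete path relate, here is a minimal sketch using Python's standard os module. It only builds and prints paths; it does not open anything. The folder it prints will differ on your machine, and joining the file name onto the working directory is done here purely for illustration.

```python
import os

# Ask Python which folder it is currently working in
working_directory = os.getcwd()
print(working_directory)

# Option 1: a bare file name is looked up relative to the working directory
filename = 'Plates_output_simple.csv'

# Option 2: build a complete path to the file
# (joined onto the working directory here purely for illustration)
full_path = os.path.join(working_directory, filename)
print(full_path)
```

Either filename or full_path could be handed to open(); the complete path keeps working even if the working directory changes later.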

Lesson Setup

We will work with the practice file Plates_output_simple.csv.

  1. Locate the file Plates_output_simple.csv in the directory home/Desktop/workshops/bash-git-python.
  2. Copy the file to your working directory, home/Desktop/workshops/YourName.
  3. Make sure that your working directory is also set to the folder home/Desktop/workshops/YourName.
  4. As you are working, make sure that you save your file opening script(s) to this directory.

The File Setup

Let's open and examine the structure of the file Plates_output_simple.csv. If you open the file in a text editor, you will see that the file contains several lines of text.

[Image: DataFileRaw]

However, this is fairly difficult to read. If you open the file in a spreadsheet program such as LibreOffice Calc or Excel, you can see that the file is organized into columns, with each column separated by the commas in the image above (hence the file extension .csv, which stands for comma-separated values).

[Image: DataFileColumns]

The file contains one header row, followed by eight rows of data. Each row represents a single plate image. If we look at the column headings, we can see that we have collected data for each plate:

  • The name of the image from which the data was collected
  • The plate number (there were 4 plates, with each plate imaged at two different time points)
  • The growth condition (either control or experimental)
  • The observation timepoint (either 24 or 48 hours)
  • Colony count for the plate
  • The average colony size for the plate
  • The percent of the plate covered by bacterial colonies

We will read in this data file and then work to analyze the data.

Opening and reading files is a three-step process

We will open and read the file in three steps.

  1. We will create a variable to hold the name of the file that we want to open.
  2. We will call a function to open the file.
  3. We will call a function to actually read the data in the file and store it in a variable so that we can process it.

And then, there's one more step to do!

  • When we are done, we should remember to close the file!

You can think of these three steps as being similar to checking out a book from the library. First, you have to go to the catalog or database to find out which book you need (the filename). Then, you have to go and get it off the shelf and open the book up (the open function). Finally, to gain any information from the book, you have to read the words (the read function)!

Here is an example of opening, reading, and closing a file.

    #Create a variable for the file name
    filename = 'Plates_output_simple.csv'  #This is simply a string of text

    #Open the file
    infile = open(filename, 'r')  # 'r' says we are opening the file to read, infile is the opened file object that we will read from

    #Store the data from the file in a variable
    data = infile.read()

    #Print the data in the file
    print(data)

    #close the file
    infile.close()

Once we have read the data in the file into our variable data, we can treat it like any other variable in our code.

Use consistent names to make your code clearer

It is a good idea to develop some consistent habits about the way you open and read files. Using the same (or similar!) variable names each time will make it easier for you to keep track of which variable is the name of the file, which variable is the opened file object, and which variable contains the read-in data.

In these examples, we will use filename for the text string containing the file name, infile for the open file object from which we can read in data, and data for the variable holding the contents of the file.
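
The close step is easy to forget. As a hedged aside, Python also offers the with statement, which closes the file automatically when its indented block ends. The sketch below keeps the same filename / infile / data naming. The file example.csv and its contents are made up here so that the sketch runs on its own; they are not part of the lesson data.

```python
import os
import tempfile

# Make a small stand-in file so this sketch runs anywhere.
# Its name and contents are made up for illustration.
folder = tempfile.mkdtemp()
filename = os.path.join(folder, 'example.csv')
outfile = open(filename, 'w')
outfile.write('name,count\nplateA,12\n')
outfile.close()

# 'with' opens the file and closes it automatically when the block ends
with open(filename, 'r') as infile:
    data = infile.read()

print(data)
print(infile.closed)  # the file object reports that it has been closed
```

The rest of this lesson keeps the explicit open and close calls so that each step stays visible.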

Commands for reading in files

There are a variety of commands that let us read in data from files.
infile.read() will read in the entire file as a single string of text.
infile.readline() will read in one line at a time (each time you call this command, it reads in the next line).
infile.readlines() will read all of the lines into a list, where each line of the file is an item in the list.
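
To make the difference between the three commands concrete, the sketch below runs each one on a fresh copy of the same small file and shows what comes back. The file three_lines.txt is a made-up stand-in written to a temporary folder, not part of the lesson data.

```python
import os
import tempfile

# Write a tiny made-up file to experiment with
folder = tempfile.mkdtemp()
filename = os.path.join(folder, 'three_lines.txt')
outfile = open(filename, 'w')
outfile.write('first\nsecond\nthird\n')
outfile.close()

# read(): the whole file as one string
infile = open(filename, 'r')
whole_file = infile.read()
infile.close()

# readline(): just the next line, as a string
infile = open(filename, 'r')
one_line = infile.readline()
infile.close()

# readlines(): every line, as a list of strings
infile = open(filename, 'r')
all_lines = infile.readlines()
infile.close()

print(repr(whole_file))  # 'first\nsecond\nthird\n'
print(repr(one_line))    # 'first\n'
print(all_lines)         # ['first\n', 'second\n', 'third\n']
```

The file is reopened before each command so that every read starts from the beginning.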

Mixing these commands can have some unexpected results.

    #Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    #Open the file
    infile = open(filename, 'r')

    #Print the first two lines of the file
    print(infile.readline())
    print(infile.readline())

    #call infile.read()
    print(infile.read())

    #close the file
    infile.close()

Notice that the infile.read() command started at the third line of the file, where the first two infile.readline() commands left off.

Think of it like this: when the file is opened, a pointer is placed at the top left corner of the file at the beginning of the first line. Any time a read function is called, the cursor or pointer advances from where it already is. The first infile.readline() started at the beginning of the file and advanced to the end of the first line. Now, the pointer is positioned at the beginning of the second line. The second infile.readline() advanced to the end of the second line of the file, and left the pointer positioned at the beginning of the third line. infile.read() began from this position, and advanced through to the end of the file.

In general, if you want to switch between the different kinds of read commands, you should close the file and then open it again to start over.
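
Closing and reopening always works. As an alternative that this lesson does not otherwise use, file objects also have a seek() method, and seek(0) moves the pointer back to the start of the file. The sketch below shows the pointer advancing and then being reset. The file pointer_demo.txt is made up for illustration.

```python
import os
import tempfile

# Write a tiny made-up file to experiment with
folder = tempfile.mkdtemp()
filename = os.path.join(folder, 'pointer_demo.txt')
outfile = open(filename, 'w')
outfile.write('line 1\nline 2\nline 3\n')
outfile.close()

infile = open(filename, 'r')

first_line = infile.readline()  # pointer moves to the start of line 2
rest = infile.read()            # reads from line 2 to the end of the file
at_end = infile.read()          # nothing left, so this is an empty string

infile.seek(0)                  # move the pointer back to the beginning
everything = infile.read()      # now the whole file is read again

infile.close()

print(repr(first_line))  # 'line 1\n'
print(repr(rest))        # 'line 2\nline 3\n'
print(repr(at_end))      # ''
print(repr(everything))  # 'line 1\nline 2\nline 3\n'
```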

Reading all of the lines of a file into a list

infile.readlines() will read all of the lines into a list, where each line of the file is an item in the list. This is extremely useful, because once we have read the file in this way, we can loop through each line of the file and process it. This approach works well on data files where the data is organized into columns similar to a spreadsheet, because it is likely that we will want to handle each line in the same way.

The example below demonstrates this approach:

    #Create a variable for the file name
    filename = "Plates_output_simple.csv"

    #Open the file
    infile = open(filename, 'r')

    lines = infile.readlines()

    for line in lines:  #lines is a list with each item representing a line of the file
        if 'control' in line:
            print(line)  #print lines for control condition

    infile.close()  #close the file when you're done!

Using .split() to separate "columns"

Since our data is in a .csv file, we can use the split command to separate each line of the file into a list. This can be useful if we want to access specific columns of the file.

    #Create a variable for the file name
    filename = "Plates_output_simple.csv"

    #Open the file
    infile = open(filename, 'r')

    lines = infile.readlines()

    for line in lines:
        sline = line.split(',')  # separates line into a list of items. ',' tells it to split the lines at the commas
        print(sline)  #each line is now a list

    infile.close()  #Always close the file!

Consistent names, again

At first glance, the variable name sline in the example above may not make much sense. In fact, we chose it to be an abbreviation for "split line", which exactly describes the contents of the variable.

You don't have to use this naming convention if you don't want to, but you should work to use consistent variable names across your code for common operations like this. It will make it much easier to open an old script and quickly understand exactly what it is doing.

Converting text to numbers

When we called the readlines() command in the previous code, Python read in each line of the file as a string. If we want our code to recognize something in the file as a number, we need to tell it this!

For example, float('5.0') will tell Python to treat the text string '5.0' as the number 5.0. int(sline[4]) will tell our code to treat the text string stored in the 5th position of the list sline as an integer (non-decimal) number.
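
A short sketch of why the conversion matters, using one row of text in the same shape as the data file. The row is typed in directly so that the example runs without the file; the names colony_count and avg_size are just illustrative.

```python
# One line of text, in the same shape as a row of the data file
line = 'colonies02.tif,2,exp,24,84,3.2,22\n'

sline = line.split(',')  # every item in sline is still a string

# Without conversion, + joins the two strings end to end
print(sline[4] + sline[4])  # 8484

# After conversion, + does arithmetic
colony_count = int(sline[4])   # '84' becomes the integer 84
avg_size = float(sline[5])     # '3.2' becomes the decimal number 3.2
print(colony_count + colony_count)  # 168
print(avg_size)                     # 3.2
```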

For each line in the file, the ColonyCount is stored in the 5th column (index 4 with our 0-based counting).
Modify the code above to print the line only if the ColonyCount is greater than 30.

Solution

    #Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    #Open the file
    infile = open(filename, 'r')

    lines = infile.readlines()

    for line in lines[1:]:  #skip the first line, which is the header
        sline = line.split(',')  # separates line into a list of items. ',' tells it to split the lines at the commas
        colonyCount = int(sline[4])  #store the colony count for the line as an integer
        if colonyCount > 30:
            print(sline)

    #close the file
    infile.close()

Writing data out to a file

Often, we will want to write data to a new file. This is especially useful if we have done a lot of computations or data processing and we want to be able to save it and come back to it later.

Writing a file is the same multi-step process

Just like reading a file, we will open and write the file in multiple steps.

  1. Create a variable to hold the name of the file that we want to open. Often, this will be a new file that doesn't yet exist.
  2. Call a function to open the file. This time, we will specify that we are opening the file to write into it!
  3. Write the data into the file. This requires some careful attention to formatting.
  4. When we are done, we should remember to close the file!

The code below gives an example of writing to a file:

    filename = "output.txt"

    #w tells python we are opening the file to write into it
    outfile = open(filename, 'w')

    outfile.write("This is the first line of the file")
    outfile.write("This is the second line of the file")

    outfile.close()  #Close the file when we're done!

Where did my file end up?

Any time you open a new file and write to it, the file will be saved in your current working directory, unless you specified a different path in the variable filename.

Newline characters

When you examine the file you just wrote, you will see that all of the text is on the same line! This is because we must tell Python when to start on a new line by using the special string character '\n'. This newline character will tell Python exactly where to start each new line.

The example below demonstrates how to use newline characters:

    filename = 'output_newlines.txt'

    #w tells python we are opening the file to write into it
    outfile = open(filename, 'w')

    outfile.write("This is the first line of the file\n")
    outfile.write("This is the second line of the file\n")

    outfile.close()  #Close the file when we're done!

Go open the file you just wrote and check that the lines are spaced correctly.

Dealing with newline characters when you read a file

You may have noticed in the last file reading example that the printed output included newline characters at the end of each line of the file:

['colonies02.tif', '2', 'exp', '24', '84', '3.2', '22\n']
['colonies03.tif', '3', 'exp', '24', '792', '3', '78\n']
['colonies06.tif', '2', 'exp', '48', '85', '5.2', '46\n']

We can get rid of these newlines by using the .strip() function, which removes whitespace, including newline characters, from the beginning and end of a string:

    #Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    #Open the file
    infile = open(filename, 'r')

    lines = infile.readlines()

    for line in lines[1:]:  #skip the first line, which is the header
        sline = line.strip()  #get rid of trailing newline characters at the end of the line
        sline = sline.split(',')  # separates line into a list of items. ',' tells it to split the lines at the commas
        colonyCount = int(sline[4])  #store the colony count for the line as an integer
        if colonyCount > 30:
            print(sline)

    #close the file
    infile.close()
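
As a small side-by-side sketch, here is the same row split with and without .strip(), plus the related .rstrip('\n'), which removes only trailing newlines and leaves other whitespace alone. The row is typed in directly and the padded string is made up for illustration.

```python
# One line of text, in the same shape as a row of the data file
line = 'colonies06.tif,2,exp,48,85,5.2,46\n'

with_newline = line.split(',')
without_newline = line.strip().split(',')

print(with_newline[-1])     # the last item still ends with a newline
print(without_newline[-1])  # 46

# strip() removes all leading and trailing whitespace;
# rstrip('\n') removes only newlines at the end
padded = '  46 \n'
stripped = padded.strip()
newline_only = padded.rstrip('\n')
print(repr(stripped))      # '46'
print(repr(newline_only))  # '  46 '
```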

Writing numbers to files

Just as Python automatically reads files in as strings, the write() function expects to write only strings. If we want to write numbers to a file, we will need to "cast" them as strings using the function str().

The code below shows an example of this:

    numbers = range(0, 10)

    filename = "output_numbers.txt"

    #w tells python we are opening the file to write into it
    outfile = open(filename, 'w')

    for number in numbers:
        outfile.write(str(number))

    outfile.close()  #Close the file when we're done!

Writing new lines and numbers

Go open and examine the file you just wrote. You will see that all of the numbers are written on the same line.

Modify the code to write each number on its own line.

Solution

    numbers = range(0, 10)  #Create the range of numbers

    filename = "output_numbers.txt"  #provide the file name

    #open the file in 'write' mode
    outfile = open(filename, 'w')

    for number in numbers:
        outfile.write(str(number) + '\n')

    outfile.close()  #Close the file when we're done!

The file you just wrote should be saved in your working directory. Open the file and check that the output is correctly formatted with one number on each line.
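
str() writes a number exactly as Python would print it. When you want control over the layout, for example a fixed number of decimal places, one option is a formatted string literal (an f-string, available in Python 3.6 and later). The sketch below is one possible approach, not the only one. The values in sizes and the file name output_formatted.txt are made up for illustration.

```python
import os
import tempfile

# Write into a temporary folder so the sketch does not clutter your work
folder = tempfile.mkdtemp()
filename = os.path.join(folder, 'output_formatted.txt')

sizes = [3.2, 3, 5.25]  # made-up measurements

outfile = open(filename, 'w')
for size in sizes:
    # {size:.2f} turns the number into text with two decimal places
    outfile.write(f'{size:.2f}\n')
outfile.close()

# Read the file back to check what was written
infile = open(filename, 'r')
written = infile.read()
infile.close()

print(written)
```

Each value ends up on its own line as 3.20, 3.00 and 5.25, which keeps a column of numbers neatly aligned.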

Opening files in different 'modes'

When we have opened files to read or write data, we have used the function parameter 'r' or 'w' to specify which "mode" to open the file in.
'r' indicates we are opening the file to read data from it.
'w' indicates we are opening the file to write data into it.

Be very, very careful when opening an existing file in 'w' mode.
'w' will over-write any data that is already in the file! The overwritten data will be lost!

If you want to add on to what is already in the file (instead of erasing and over-writing it), you can open the file in append mode by using the 'a' parameter instead.
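
The difference between 'w' and 'a' is easiest to see by trying both on the same file. The sketch below writes a file, appends to it, and then overwrites it, reading the contents back after each step. The file name log.txt and the text written are made up for illustration, and the file lives in a temporary folder.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
filename = os.path.join(folder, 'log.txt')

# 'w' creates the file (or empties it if it already exists)
outfile = open(filename, 'w')
outfile.write('first run\n')
outfile.close()

# 'a' keeps what is there and adds to the end
outfile = open(filename, 'a')
outfile.write('second run\n')
outfile.close()

infile = open(filename, 'r')
after_append = infile.read()
infile.close()

# Opening in 'w' again erases both earlier lines
outfile = open(filename, 'w')
outfile.write('third run\n')
outfile.close()

infile = open(filename, 'r')
after_overwrite = infile.read()
infile.close()

print(repr(after_append))     # 'first run\nsecond run\n'
print(repr(after_overwrite))  # 'third run\n'
```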

Pulling it all together

Read in the data from the file Plates_output_simple.csv that we have been working with. Write a new csv-formatted file that contains only the rows for control plates.
You will need to do the following steps:

  1. Open the file.
  2. Use .readlines() to create a list of lines in the file. Then close the file!
  3. Open a file to write your output into.
  4. Write the header line of the output file.
  5. Use a for loop to loop through each line in the list of lines from the input file.
  6. For each line, check if the growth condition was experimental or control.
  7. For the control lines, write the line of data to the output file.
  8. Close the output file when you're done!

Solution

Here's one way to do it:

    #Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    #Open the file
    infile = open(filename, 'r')

    lines = infile.readlines()  #We will process the lines of the file later

    #close the input file
    infile.close()

    #Create the file we will write to
    filename = 'ControlPlatesData.txt'
    outfile = open(filename, 'w')

    outfile.write(lines[0])  #This will write the header line of the file

    for line in lines[1:]:  #skip the first line, which is the header
        sline = line.split(',')  # separates line into a list of items. ',' tells it to split the lines at the commas
        condition = sline[2]  #store the condition for the line as a string
        if condition == "control":
            outfile.write(line)  #The variable line is already formatted correctly!

    outfile.close()  #Close the file when we're done!

Challenge Problem

Open and read in the data from Plates_output_simple.csv. Write a new csv-formatted file that contains only the rows for the control condition and includes only the columns for Time, colonyCount, avgColonySize, and percentColonyArea. Hint: you can use the .join() function to join a list of items into a string.

    names = ['Erin', 'Mark', 'Tessa']
    nameString = ', '.join(names)  #the ', ' tells Python to join the list with each item separated by a comma + space
    print(nameString)

Erin, Mark, Tessa

Solution

    #Create a variable for the input file name
    filename = 'Plates_output_simple.csv'

    #Open the file
    infile = open(filename, 'r')

    lines = infile.readlines()  #We will process the lines of the file later

    #close the file
    infile.close()

    #Create the file we will write to
    filename = 'ControlPlatesData_Reduced.txt'
    outfile = open(filename, 'w')

    #Write the header line
    headerList = lines[0].split(',')[3:]  #This will return the list of column headers from 'time' on
    headerString = ','.join(headerList)  #join the items in the list with commas
    outfile.write(headerString)  #There is already a newline at the end, so no need to add one

    #Write the remaining lines
    for line in lines[1:]:  #skip the first line, which is the header
        sline = line.split(',')  # separates line into a list of items. ',' tells it to split the lines at the commas
        condition = sline[2]  #store the condition for the line as a string
        if condition == "control":
            dataList = sline[3:]
            dataString = ','.join(dataList)
            outfile.write(dataString)  #The last item in the list still ends with a newline, so no need to add one

    outfile.close()  #Close the file when we're done!

Key Points

  • Opening and reading a file is a multistep process: defining the filename, opening the file, and reading the data

  • Data stored in files can be read in using a variety of commands

  • Writing data to a file requires attention to data types and formatting that isn't necessary with a print() statement


Source: https://eldoyle.github.io/PythonIntro/08-ReadingandWritingTextFiles/
