Litigation Support Tip of the Night

December 7, 2019

Files with the extension .rst are reStructuredText files, a plain text format widely used in the Python community to generate documentation with more complex formatting. They can be processed with the Docutils library, but if you're looking for a quick and easy way to convert .rst files, try installing the RST (reStructuredText) Viewer and Editor app for Chrome.

Enter the text from the .rst file on the left, click the 'View as PDF' button at the top, and on the right a formatted PDF will be generated.  

December 6, 2019

Heads up! Be on the lookout for malicious Python libraries! As reported by Developer Tech and Naked Security, there is an imposter version of the popular dateutil library, which is called python3-dateutil. The same reports describe a second imposter, jeIlyfish, which copies the name of the jellyfish library but with a capital I in place of the first lowercase L, 'l'. The malicious code can steal keys for Secure Shell (SSH), a network encryption protocol, and GNU Privacy Guard (GnuPG), encryption software.

You can confirm which Python modules you have installed with the pip freeze command. Open a command prompt in your scripts directory and simply enter 'pip freeze'.

. . . and a list of installed modules will be generated. 
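The same check can be sketched in Python itself. This is a minimal example, assuming Python 3.8 or later (where the standard library includes importlib.metadata). The function names and the SUSPECT_NAMES set are hypothetical, chosen for illustration; add any other names you want to screen for.

```python
from importlib import metadata

# Hypothetical watch list: extend it with other names you want to flag.
SUSPECT_NAMES = {"python3-dateutil"}


def find_suspects(installed_names, suspects=SUSPECT_NAMES):
    """Return the sorted installed names that match a suspect name, ignoring case."""
    lowered = {s.lower() for s in suspects}
    return sorted(n for n in installed_names if n.lower() in lowered)


def installed_package_names():
    """List the distribution names visible to the current interpreter."""
    names = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with missing metadata
            names.append(name)
    return names


if __name__ == "__main__":
    hits = find_suspects(installed_package_names())
    print("Suspect packages found:", hits if hits else "none")
```

The comparison is an exact, case-insensitive match on the package name, so it only flags names you have already put on the list.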

August 25, 2019

In Python you can use the fnmatch module to search file names in a given directory.   

1. Begin by importing the fnmatch module and the os module. 

2. Enter the path of the directory you need to search, and the pattern of the file type you're searching for:

3. Then enter the below script. os.walk generates the file names in a directory tree by walking through it.
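The three steps above might look something like the following. This is only a sketch, not necessarily the script originally shown: the function name find_files, the starting folder "." and the pattern "*.txt" are assumptions for illustration.

```python
import fnmatch
import os


def find_files(root, pattern):
    """Walk root and return the paths whose file name matches pattern."""
    matches = []
    # os.walk yields one (folder, subfolders, files) tuple per directory
    for dirpath, dirnames, filenames in os.walk(root):
        # fnmatch.filter keeps only the names that match the pattern
        for name in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # Hypothetical folder and pattern; substitute your own.
    for path in find_files(".", "*.txt"):
        print(path)
```

Because os.walk descends into subfolders, this version also reports matching files below the starting directory.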

April 1, 2019

Here's a demonstration of how to use a Python script to find and replace one string with another in multiple text files. (Note that I'm working in version 3.7 of Python IDLE.)

In this example we're working with three text files that each refer to a New York city, which is misidentified as being in New Jersey. 

Begin by importing the os module (which can enable operating system dependent functionality) and the re module for regular expression operations. 

>>> import os, re

Get a list of the files in the folder in which you want to work:

>>> directory = os.listdir('C:/foofolder8')

Change the current working directory to that folder, so the file names can be opened without a full path:
>>> os.chdir('C:/foofolder8')

Loop through each of the files in the folder, read its text, and close it:
>>> for file in directory:
    open_file = open(file,'r')
    read_file = open_file.read()
    open_file.close()

With re.compile set the string you want to replace:
    regex = re.compile('jersey')

With regex.sub set the string you want to insert:
    read_file = regex.sub('york', read_file)

Finally write in the new text and close the file:
    write_file = open(file, 'w')
    write_file.write(read_file)
    write_file.close()

The text files will be automatically updated.

Note that if you leave out the close() calls, you may find an error in the last file: its new text can stay in a buffer and not reach the disk until the file is closed.

Thanks to Abder Rahman-Ali for posting this script here.   
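For reuse, the same find and replace can be wrapped in a function. This is a sketch under a few assumptions that are not in the script above: the function name replace_in_folder is hypothetical, only files ending in '.txt' are touched, the search string is treated as literal text rather than a regular expression, and with blocks close each file automatically.

```python
import os
import re


def replace_in_folder(folder, old, new, extension=".txt"):
    """Replace old with new in every matching file; return the names changed."""
    pattern = re.compile(re.escape(old))  # treat old as literal text
    changed = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not (os.path.isfile(path) and name.endswith(extension)):
            continue
        with open(path, "r") as f:
            text = f.read()
        # The lambda inserts new exactly as typed, with no backslash processing
        new_text, count = pattern.subn(lambda match: new, text)
        if count:
            with open(path, "w") as f:
                f.write(new_text)
            changed.append(name)
    return changed
```

Calling replace_in_folder('C:/foofolder8', 'jersey', 'york') would rewrite only the files that actually contain the string and return their names.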

March 4, 2019

Using the openpyxl module you can use Python to create an Excel workbook and name its worksheets. Follow these steps:

First import the openpyxl module

>>> import openpyxl

Then import the Workbook class from openpyxl
>>> from openpyxl import Workbook

Create the workbook
>>> wb = Workbook()

Name the worksheet
>>> sheet = wb.active
>>> sheet.title = "identification"

Then save the workbook as a new Excel file
>>> wb.save('edrm4.xlsx')
 

A new workbook is created with a worksheet named 'identification'. 

Additional sheets can be created and named. 

>>> sheet2 = wb.create_sheet()

Select the worksheet you want to name by its index. Indexing starts at 0, so 1 is the second worksheet.
>>> sheet2 = wb.worksheets[1]
>>> sheet2.title = "preservation"

It's necessary to save the workbook before the sheet will be created in the file. 
>>> wb.save('edrm4.xlsx')

February 23, 2019

A very useful script has been posted here, which can be used to search for a string in multiple text files.    Simply copy the below code into a text file and save it with the extension '.py'.   (Note that I have added this code:

k=input("press close to exit")         

. . . at the end to prevent the window from closing after the script is run.)


Double-click on the new file and it will prompt you to enter the path containing the text files you are searching through; the extension of the text files; and the string to be searched for. 

The results return each line in the file in which the string appears. 

# Import os module
import os

# Ask the user for the directory, file type and string to search for
search_path = input("Enter directory path to search : ")
file_type = input("File Type : ")
search_str = input("Enter the search string : ")

# If path does not exist, set search path to current directory
if not os.path.exists(search_path):
    search_path = "."

# Repeat for each file in the directory
for fname in os.listdir(path=search_path):

    # Apply file type filter
    if fname.endswith(file_type):

        # Open file for reading. os.path.join adds the directory
        # separator, and the with block closes the file afterwards
        with open(os.path.join(search_path, fname)) as fo:

            # Loop over the lines, counting line numbers from 1
            for line_no, line in enumerate(fo, start=1):

                # Search for string in line
                index = line.find(search_str)
                if index != -1:
                    print(fname, "[", line_no, ",", index, "] ", line, sep="")

# Keep the window open until the user presses Enter
k = input("press close to exit")
 

February 21, 2019

Here's a quick demo of how to use Python to pull a designated range from a spreadsheet. 

In this example I'm working with this spreadsheet:

Import the xlrd library

>>> import xlrd

Set the location of the file you are analyzing, using a raw string so the backslashes are not read as escape characters:
>>> loc = (r"C:\FooFolder\Cities.xlsx")

Access the workbook:

>>> wb = xlrd.open_workbook(loc)

Designate the worksheet number you're reviewing:

>>> sheet = wb.sheet_by_index(0)

Get a column heading:
>>> sheet.cell_value(0,0)

'City'

Loop over the total number of rows on the worksheet with data, and print just the data in those rows from the first column:

>>> for i in range(sheet.nrows):
    print(sheet.cell_value(i,0))

    
City
New York
Los Angeles
Chicago
Houston
Philadelphia
Phoenix
Montreal
Toronto
Mexico City
>>> 

Varying the script this way will get all of the column headings:

>>> for x in range(sheet.ncols):
    print(sheet.cell_value(0,x))

    
City
State
Country

Check out Geeks for Geeks for more excellent Python tips. 

February 19, 2019

Here's a quick demo on how to count the number of rows and columns in an Excel file using Python. 

First open the command prompt in the scripts folder for Python.  Enter this command to get the xlrd library:

pip install xlrd

We'll analyze this Excel file:

Import the xlrd module
>>> import xlrd

Set the location of the file you want to analyze, using a raw string so the backslashes are not read as escape characters:
>>> loc = (r"C:\FooFolder\Cities.xlsx")

Identify it as an Excel workbook:
>>> wb = xlrd.open_workbook(loc)

Select the sheet you want to review:
>>> sheet = wb.sheet_by_index(0)

We can read a particular cell's value:
>>> sheet.cell_value(0,0)
'City'

Count the total number of rows:
>>> print(sheet.nrows)
7

. . . and the total number of columns:
>>> print(sheet.ncols)
1
>>> 

February 18, 2019

Pandas is a data manipulation library for Python.  Among other things it can be used to sort Excel spreadsheets. 

Note that it will not be possible to install Pandas on Python 3.4 and lower. If you try to install pandas with these versions you will get an error:

Upgrade to the latest version (3.7.2) to install pandas. Be sure to use the 32-bit or 64-bit version as appropriate. After it is installed, open a command prompt in the scripts subfolder of the directory where Python was installed. Enter the command:

pip install pandas

Pandas will be installed.   Now when using Python 3.7.2, you will be able to successfully import pandas:
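If you are unsure which Python version is running or whether the install worked, a short standard library check can help. This is a sketch only: the helper name module_available is hypothetical, and it tests whether a top-level module can be found without actually importing it.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if a top-level module called name can be found for import."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Show the interpreter version, then whether pandas can be imported
    print("Python version:", ".".join(str(n) for n in sys.version_info[:3]))
    print("pandas installed:", module_available("pandas"))
```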

February 17, 2019

Tonight's tip shows how to perform another basic task using a basic Python script.   Using Python IDLE 3.4.2, we'll get a list of all of the Excel files in this directory:

Open Python IDLE and import the glob module.

>>> import glob

Then use this command to generate a list of the files with a particular extension in the specified directory. A raw string keeps the backslashes from being read as escape characters.
>>> glob.glob(r"C:\FooFolder\python\*.xlsx")

These results are returned:
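The glob call can also be wrapped in a small function so the folder and extension are easy to change. This is a sketch, and the function name list_by_extension is an assumption for illustration. glob.escape is used so that any special characters in the folder name itself are matched literally.

```python
import glob
import os


def list_by_extension(folder, extension):
    """Return the sorted paths in folder whose names end with extension."""
    # Escape the folder so only the "*" acts as a wildcard
    pattern = os.path.join(glob.escape(folder), "*" + extension)
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # Hypothetical folder; substitute your own.
    for path in list_by_extension(".", ".xlsx"):
        print(path)
```

Unlike os.walk, glob.glob with this pattern looks only in the named folder, not in its subfolders.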


Sean O'Shea has more than 15 years of experience in the litigation support field with major law firms in New York and San Francisco.   He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.

 

All content provided on this blog is for informational purposes only. The owner of this blog makes no representations as to the accuracy or completeness of any information on this site or found by following any link on this site. The owner will not be liable for any errors or omissions in this information nor for the availability of this information. The owner will not be liable for any losses, injuries, or damages from the display or use of this information.

 

This policy is subject to change at any time.

 


Contact Me With Your Litigation Support Questions:

seankevinoshea@hotmail.com


© 2015 by Sean O'Shea . Proudly created with Wix.com