
You can use a PowerShell script to count the number of lines in multiple text files saved to a folder.



Enter the file path for the folder after the Get-ChildItem command on the first line. Then specify the extension of the files to be analyzed towards the end of the first line, after '-eq'.


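Get-ChildItem c:\foofolder\test2 -Recurse | Where-Object { $_.Extension -eq ".txt" } | ForEach-Object {
    # Emit the file name plus a calculated 'Lines' column
    $_ | Select-Object -Property 'Name', @{
        Label = 'Lines'; Expression = {
            # @(...) forces an array, so a one-line file counts as 1 line
            # rather than returning the character length of that line
            @($_ | Get-Content).Count
        }
    }
} | Out-File C:\foofolder\test2\lines1.txt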

On the last line, provide the file path for a new text file to which PowerShell will write the results. Open Windows PowerShell ISE (x86), enter the script in a new pane, and press the play button on the toolbar.



The text file that is generated will list each file name in the source folder and show the number of lines in each in a column to the right.
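If the folder holds very large files, a streaming variant may be worth trying. The sketch below is not the script from the post: it assumes Windows PowerShell 3.0 or later, counts lines with the .NET File.ReadLines method instead of Get-Content, and writes a CSV report. The report path is a hypothetical placeholder, chosen outside the scanned folder so that the report is not picked up on a later run.

```powershell
# Sketch only; assumes Windows PowerShell 3.0 or later.
$folder = 'C:\foofolder\test2'        # same placeholder folder as the script above
$report = 'C:\foofolder\lines1.csv'   # hypothetical path, outside the scanned folder

Get-ChildItem -LiteralPath $folder -Recurse -File |
    Where-Object { $_.Extension -eq '.txt' } |
    ForEach-Object {
        # ReadLines streams the file one line at a time,
        # so a large file is never held in memory all at once
        $count = 0
        foreach ($line in [System.IO.File]::ReadLines($_.FullName)) { $count++ }

        # One output row per file: name and line count
        [pscustomobject]@{ Name = $_.Name; Lines = $count }
    } |
    Export-Csv -Path $report -NoTypeInformation
```

The CSV output can be opened directly in Excel, which may be handier than a fixed-width text table when there are many thousands of files.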












I ran this script on a set of more than 100,000 text files (which turned out to consist of more than 9 million lines) and it finished the review in less than 30 minutes.


The script can also be used to find the number of lines in other files such as .csv files.


Be sure to enter the file paths in quotes if they include blank spaces.
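As a short illustration of the two notes above, the sketch below points the same pipeline at .csv files and uses quoted paths. The folder and file names are hypothetical; they were made up only to show paths that contain blank spaces.

```powershell
# Hypothetical paths containing spaces; the quotes keep each path as one argument
$source = 'C:\foo folder\test 2'
$output = 'C:\foo folder\line counts.txt'

Get-ChildItem -Path $source -Recurse |
    Where-Object { $_.Extension -eq '.csv' } |
    Select-Object -Property Name, @{
        Label = 'Lines'
        # @(...) forces an array so that one-line files are counted as 1
        Expression = { @(Get-Content -LiteralPath $_.FullName).Count }
    } |
    Out-File -FilePath $output
```

Without the quotes, PowerShell would split a path such as C:\foo folder\test 2 at the spaces and treat the pieces as separate arguments.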


Thanks to Hari Parkash for posting this script here.


 
 

In 2015, Magistrate Judge William V. Gallo, in United States ex rel. Carter v. Bridgepoint Educ., Inc., 305 F.R.D. 225 (S.D. Cal. 2015), ruled on a dispute over the production of data from backup tapes and the format used for email productions.


The Plaintiffs contended that the Defendants intentionally altered data by transferring it to backup tapes because litigation was anticipated, and that this transfer constituted a form of intentional spoliation. The Defendants in turn asserted that the data was inaccessible and so the cost of production should shift to the Plaintiffs. The requested data consisted of matrices used by the Defendants to track the performance of enrollment advisors, which in turn was used to determine how much those employees were paid. The Plaintiffs contended that this practice violated the Higher Education Act's prohibition against incentive payments.


The backup tapes in question were used for disaster recovery. The encrypted tapes could be used to store more than 1 TB of data. The Defendants stated that it was only possible to restore one tape per day, and that the full restoration process would take several months and cost more than $2.2 million for the data for all of the relevant custodians to be converted to native format. According to the Defendants, these facts made the production unduly burdensome. The Defendants had been transferring data to backup tapes for a long time before the suit was unsealed. (This is a qui tam action, which the Defendants only received notice of when the government chose not to intervene.)


The Plaintiffs argued that the Defendants, as a large 'billion dollar' public company that emphasizes its technological capabilities, should have the resources to handle the production, and noted that their suit concerned more than $2 billion in damages. They faulted the Defendants for failing to disclose how their backup tape system worked.


In his decision, Judge Gallo, citing Zubulake v. UBS Warburg LLC, 217 F.R.D. 309 (S.D.N.Y. 2003), acknowledged that a party is not entitled to cost shifting if it converts data into an inaccessible format when it is reasonably foreseeable that the data will be discoverable in anticipated litigation. But he emphasized that the litigation must be probable, not merely possible.


In rejecting the contention that the Defendants deliberately made data inaccessible, the Court noted that, "[e]ven in making this accusation, Plaintiffs acknowledge that this ESI has been placed onto 'backup tapes,' thereby accepting Defendants' own description of the relevant ESI as 'inaccessible.' Dangerously, Plaintiffs have chosen to describe this storage system as adopted 'under the pretext or excuse of a business purpose,' even though the use of backup tapes for non-active ESI has become standard business practice." Bridgepoint Educ., Inc., 305 F.R.D. at 241. The opinion cites dozens of holdings that ESI stored on backup tapes is inaccessible from a technological standpoint.


The Defendants did restore one backup tape to native format; it contained all of the emails between the relevant employee custodians and their superiors. This gave the Plaintiffs "an unfettered ability to examine almost every potentially relevant quantum of ESI." Id. at 242. The Defendants produced other, less relevant emails as TIFF images. The Plaintiffs only offered their own attorneys' estimates of the cost of production, while the Defendants filed a declaration prepared by an expert. The Court also noted the Plaintiffs' failure to specify the exact data they were requesting. "If a party fails to identify the form or forms in which it wishes ESI to be produced and any particular fields or types of metadata sought, the non-requesting party may rightly provide the ESI sought in the form in which it is regularly maintained. With Plaintiffs' request ambiguous as to form and format, Defendants were certainly reasonable in refusing to provide reasonably inaccessible ESI." Id. at 243. Judge Gallo rejected the claim of intentional spoliation because the Plaintiffs did not explain why the Defendants' storage process was unusual.


The Court also rejected the Plaintiffs' request for the production of emails from active storage in native format. It regarded a TIFF image production as a proper response to a "generic request for original documents." Id. at 245.

 
 
Nov 28, 2022

Updated: Dec 6, 2022

Note that while Microsoft no longer supports Internet Explorer, there is an Internet Explorer mode available in the Edge browser.


Various legacy applications are still programmed only to function in IE, but the latest Windows systems may not allow IE to run.


Under the default browser options in Settings, Edge lets you turn on a mode that loads pages using the old browser.




 
 

Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco.   He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.



© 2015 by Sean O'Shea . Proudly created with Wix.com
