Tag: large-files
Posts of Tag: large-files
  1. Searching for string in massive files efficiently

    I found variants of this idea, but none that could get me (very new to Python) to where I need to be. Here is the scenario: I have one massive, 27 gig hashfile.txt consisting of unique strings all on separate ...
  2. Read lines by number from a large file

    I have a file with 15 million lines (will not fit in memory). I also have a small vector of line numbers - the lines that I want to extract. How can I read out the lines in one pass? I was hoping for a C func...
  3. Comparing two large files

    I need to write a program that will write to a file the difference between two files. The program has to loop through a 600 MB file with over 13,464,448 lines, check if a grep returns true on another file and t...
  4. Estimating the time to read a large file from SD Card on Android

    I want to get an idea (ballpark) of how much time it takes to read a large file (50MB to 100MB) stored on Android's SD card. I'm using the following code on Android 2.3.3 on a Google Nexus One. Will this give m...
  5. Sorting gigantic binary files with C#

    I have a large file, roughly 400 GB in size, generated daily by an external closed system. It is a binary file with the following format: byte[8]byte[4]byte[n] Where n is equal to the int32 value of byte[4]...
  6. Parsing large (20GB) text file with python - reading in 2 lines as 1

    I'm parsing a 20 GB file and outputting lines that meet a certain condition to another file; however, occasionally Python will read in 2 lines at once and concatenate them. inputFileHandle = open(inputFileName, '...
  7. how to speed up the pattern search btw two lists : python

    I have two fastq files like the one given below. Each record in the file starts with '@'. For two such files, my aim is to extract records that are common between the two files. @IRIS:7:1:17:394#0/1 GTCAGGACAAGAAAGACAA...
  8. How to check if two large files are identical on amazon S3?

    I need to move large files (>5GB) on Amazon S3 with boto, from and to the same bucket. For this I need to use the multipart API, which does not use MD5 sums for ETags. While I think (well only 98% sure) that m...
  9. Charting massive amounts of data

    We are currently using ZedGraph to draw a line chart of some data. The input data comes from a file of arbitrary size; therefore, we do not know the maximum number of datapoints in advance. However, by ope...
  10. Is there a memory efficient and fast way to load big json files in python?

    I have some JSON files of about 500MB each. If I use the "trivial" json.load to load the content all at once, it will consume a lot of memory. Is there a way to read the file partially? If it was a text, line delimit...
  11. How to parse a large file taking advantage of threading in Python?

    I have a huge file and need to read and process it. with open(source_filename) as source, open(target_filename) as target: for line in source: target.write(do_something(line)) do_something_else...
  12. Large file not flushed to disk immediately after calling close()?

    I'm creating large files with my Python script (more than 1GB each; there are actually 8 of them). Right after I create them I have to create a process that will use those files. The script looks like: # This is more com...