pandas.errors.ParserError: Error tokenizing data. C error: out of memory

This error is raised when pandas’ C parser runs out of memory while tokenizing the file being read. It typically happens when the file is too large to fit in the available memory.

To handle this error, you can try the following approaches:

  1. Reduce the size of the data: If the data file is too large, you can work with a smaller subset of it. You can read only part of the file up front with read_csv() parameters such as nrows and usecols, and then use functions like head(), sample(), or query() to extract a portion of the loaded data for testing or analysis (see the sketch just after this list).
  2. Read the data in chunks: Instead of loading the entire data into memory at once, you can read it in smaller chunks. For instance, you can use pandas’ read_csv() function with the chunksize parameter to read the data in chunks, process each chunk separately, and then combine the results.
  3. Increase available memory: If your system has limited memory, you can try increasing the available memory, for example by installing additional RAM or using a machine with higher specifications.
  4. Optimize memory usage: You can reduce memory usage by converting columns to more memory-efficient data types. For example, use int8 instead of int64 where the values allow it, or category instead of object for columns with a limited number of unique values (see the sketch after the worked example below).
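
As a quick sketch of point 1, the snippet below assumes a hypothetical data.csv with columns named column1 and column2; the row counts and the query condition are only illustrative:

import pandas as pd

# Read only part of the file up front, so the full file never has to fit in memory
subset = pd.read_csv("data.csv", nrows=100_000, usecols=["column1", "column2"])

# Trim the loaded subset further for testing or analysis
preview = subset.head(10)               # first 10 rows
sample = subset.sample(n=1_000)         # random sample of 1,000 rows
filtered = subset.query("column1 > 0")  # rows matching a condition
print(len(preview), len(sample), len(filtered))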

Let’s consider an example that demonstrates reading a large CSV file in chunks using pandas. The file name data.csv and the column name column1 below are placeholders; adjust them to match your own data:

import pandas as pd

chunk_size = 10_000  # number of rows to read in each chunk
chunks = []

# chunksize makes read_csv() return an iterator of DataFrames,
# so only one chunk has to be held in memory at a time
for chunk in pd.read_csv("data.csv", chunksize=chunk_size):
    # Process each chunk separately; the filter below is just a
    # placeholder for whatever per-chunk logic you need
    processed = chunk[chunk["column1"] > 0]
    chunks.append(processed)

# Combine the per-chunk results into a single DataFrame
result = pd.concat(chunks, ignore_index=True)
print(result.shape)

This example reads a large CSV file in chunks of 10,000 rows each using pandas’ read_csv() with the chunksize parameter. Each chunk is processed on its own, and the per-chunk results are then combined with concat(), so the full file never has to be held in memory at once.
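
The chunked approach combines well with the dtype optimization from point 4. The sketch below again assumes the hypothetical data.csv, with a small-range integer column1 and a low-cardinality text column2:

import pandas as pd

# Declare memory-efficient dtypes up front so the parser never builds
# the more expensive int64/object columns in the first place
dtypes = {"column1": "int8", "column2": "category"}
df = pd.read_csv("data.csv", dtype=dtypes)

# Inspect how much memory each column actually uses (in bytes)
print(df.memory_usage(deep=True))

# Columns of an existing DataFrame can also be downcast after loading
df["column1"] = pd.to_numeric(df["column1"], downcast="integer")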
