question@mail.ru · 01.01.1970 03:00

Error: pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 5, saw 9

I'm trying to read a TSV file in pandas.

    import pandas as pd
    df = pd.read_csv(filename, sep='\t')
    print(df)

After running this code, I see the following error in the console:

        df = pd.read_csv(filename, sep='\t')
      File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/io/parsers.py", line 655, in parser_f
        return _read(filepath_or_buffer, kwds)
      File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/io/parsers.py", line 411, in _read
        data = parser.read(nrows)
      File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/io/parsers.py", line 982, in read
        ret = self._engine.read(nrows)
      File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/io/parsers.py", line 1719, in read
        data = self._reader.read(nrows)
      File "pandas/_libs/parsers.pyx", line 890, in pandas._libs.parsers.TextReader.read (pandas/_libs/parsers.c:10862)
      File "pandas/_libs/parsers.pyx", line 912, in pandas._libs.parsers.TextReader._read_low_memory (pandas/_libs/parsers.c:11138)
      File "pandas/_libs/parsers.pyx", line 966, in pandas._libs.parsers.TextReader._read_rows (pandas/_libs/parsers.c:11884)
      File "pandas/_libs/parsers.pyx", line 953, in pandas._libs.parsers.TextReader._tokenize_rows (pandas/_libs/parsers.c:11755)
      File "pandas/_libs/parsers.pyx", line 2184, in pandas._libs.parsers.raise_parser_error (pandas/_libs/parsers.c:28765)
    pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 5, saw 9

What could the problem be?

answer@mail.ru · 01.01.1970 03:00

Most likely the data in the file is malformed (for example, a heading precedes the data). When pandas parses the file, it has to work out how many columns to create, and if the number of fields in the first lines differs from the number in later lines, this error is raised.
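To find where the layout changes, it can help to count the tab-separated fields on each line before handing the file to pandas. Below is a minimal sketch using only the standard library. The sample text and the helper name `field_counts` are made up for illustration; with a real file you would pass `open(filename, newline='')` instead of the `io.StringIO` object.

```python
import csv
import io

# Hypothetical sample: four preamble lines without tabs,
# then 9-column tab-separated data starting on line 5.
sample = (
    "Report\n"
    "generated by some tool\n"
    "units: mm\n"
    "source: sensor A\n"
    + "\t".join(f"col{i}" for i in range(9)) + "\n"
    + "\t".join(str(i) for i in range(9)) + "\n"
)


def field_counts(handle, delimiter="\t"):
    """Return (line_number, number_of_fields) for each row csv.reader yields."""
    reader = csv.reader(handle, delimiter=delimiter)
    return [(line_no, len(row)) for line_no, row in enumerate(reader, start=1)]


counts = field_counts(io.StringIO(sample))
for line_no, n_fields in counts:
    print(line_no, n_fields)
```

A jump in the second number (here from 1 to 9 on line 5) marks the line where the real table starts or where a row is broken, which is the kind of layout that the "Expected 1 fields in line 5, saw 9" message points to.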

I know of two solutions.

  • Ignore the error.

    pd.read_csv(filename, sep='\t', error_bad_lines=False)

    In this case the table is built from the first line, and every line that does not match its format is dropped. This works if your data starts on the first line but may "slip" out of alignment somewhere further down the file.

  • Skip the heading.

    pd.read_csv(filename, sep='\t', skiprows=n)

    In this case the first n lines of the file are skipped and reading starts at line n+1. This suits files that begin with heading lines.
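Both options can be tried on small in-memory samples. The sketch below assumes pandas 1.3 or newer, where `on_bad_lines="skip"` replaces `error_bad_lines=False` (that older argument was deprecated in pandas 1.3 and removed in 2.0). The sample texts and variable names are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical TSV text: a header, two good rows and one row with an extra field.
bad_row_text = "a\tb\tc\n1\t2\t3\n4\t5\t6\t7\n8\t9\t10\n"

# Option 1: drop rows that have more fields than the header.
# Assumption: pandas >= 1.3; older versions used error_bad_lines=False instead.
df_skip_bad = pd.read_csv(io.StringIO(bad_row_text), sep="\t", on_bad_lines="skip")
print(df_skip_bad)

# Hypothetical TSV text: two free-text heading lines before the real header.
preamble_text = "exported by some tool\nunits: mm\na\tb\tc\n1\t2\t3\n"

# Option 2: skip the first n lines (here n = 2) so parsing starts at the real header.
df_skip_head = pd.read_csv(io.StringIO(preamble_text), sep="\t", skiprows=2)
print(df_skip_head)
```

Skipping bad lines silently discards data, so it is worth comparing the row count of the result with the number of lines in the file. When the only problem is a heading at the top, `skiprows` is the less lossy choice.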
