Studying Python 3.X, I went on the basics. Before that, I wrote on php.
now I decided to complete the plan:
now My code
import csv def gene_name ( gene, polfzm, genotype ): retu '_' .join ([gene, polfzm, genotype]). strip (). Replace (, ) data = {} geenes = [] of open ( 'example-2.csv' , '`` ) as file: reader = csv.dictreader (File, Delimiter = ';' ) for row in reader: group = row [ 'group' ] person_id = row [ 'person_id' ] ab = row [ '' genotype '] name = gene_name (row [ 'gene_name' ], row [ 'polfzm_name' ], ab) data [group] [name_id] = ab # arises Error Pre> and there is a problem. If I understand correctly, then a word is suitable for storing my data in a structured form - data . In PHP, I could easily create the structure/attachment of the data [group] [person_id] [name] = ab . Immediately there is an error keyerror: '3'
what's the mistake of my approach. In the courses that I passed, did not work with multidimensional structures. Did I choose the type of variable for storing data correctly - a dictionary? What is the right thing to do here?
the structure of the data that needs to be obtained
an example of data csv
Person_id; Gene_name; Polfzm_name; Genotype; Expr1; Group 1086 ; at1; f11; t/c RS2036914; take a rally; take; 3 1086 ; at1; f11; s/t rs2289252; take away; 3 1086 ; at1; itga2; c/t rs1126643; ct; take away; 3 1086 ; at1; itgb3; t/c; t/c RS5918; take a rally; 3 1086 ; at1; vegf; c/t rs3025039; take a rally; 3 1085 ; at2; f11; t/c rs2036914; tt; take a rally; 3 1085 ; at2; f11; s/t RS2289252; take a rally; 3 1085 ; at2; itga2; c/t rs1126643; tt; take on frequent; 3 1085 ; at2; itgb3; t/c rs5918; tt; take a rally; 3 1085 ; at2; vegf; c/t; RS3025039; take a rally; 3 23 ; Foreign Ministry 6 ; ac; alu inds/del; ID; class = ""> 1
23 ; Foreign Ministry 6 ; f1; thr312ala; pregnanthrm; 1 23 ; Foreign Ministry 6 ; f11; t/c RS2036914; tc; pregnanturm; 1 23 ; Foreign Ministry 6 ; f11; s/t rs2289252; ct; pregnant gram; 1 23 ; Foreign Ministry 6 ; f13; val34leu; valleu; pregnant gram; 1 23 ; the Ministry of Foreign Affairs 6 ; f2; g20210a; gg; pregnant gram; 1 23 ; Foreign Ministry 6 ; f5; arg506gln; gg; pregnant core; 1 23 ; FMI 6 ; itga2; c/t RS1126643; CC; PCP; 1 23 ; Foreign Ministry 6 ; itgb3; t/c RS5918; tc; class = ""> 1 23 ; Foreign Ministry 6 ; vegf; g-634c; gg; taken; 1 data - flat data sample from a relational database. T.K. There were connections, and the data was selected from several tables, many values from the line to the line are repeated. For example, the first 5 lines of data (not taking into account the hat) - belong to one person (person_id), one group (Group), but have different genes (Gene_name), etc.
, according to data, it is planned to make multiple comparisons, search for combinations, etc. - the order of ~ 4 million passes, so you need such a structure as not to run along all lines of CSV so many times
question@mail.ru
·