


Convert a huge txt-file into a dataset






My friend has a huge txt-log of sea levels. He wants to organize it into a dataset.

After importing the file, I used StringSplit to separate it into rows, and then each row into individual elements:

    rawData = Import["rawData.txt"];
    splitRawData = StringSplit[rawData, "%%"];
    dataIwant = splitRawData[[19]];
    FullForm[dataIwant];
    splitDataIntoRows = StringSplit[dataIwant, "\n"];
    splitData1 = StringSplit[splitDataIntoRows, " "];


I want to use this function to split the data into 6 columns:

    convertListToAssociation = 
      list \[Function] 
        AssociationThread[{"Time (kyr BP)", "Sea level (m)", "T_NH(deg C)", 
          "T_dw (deg C)", "delta_w", "delta_T"}, list]

What further steps should I take?







string-manipulation data dataset data-structures

asked 6 hours ago by Artem Anisimov




















2 Answers





















Because the dataset is quite large, you should really work with arrays in this case. You can import the table in one go as follows:

    data = Import[
       "rawData.txt",
       "Table",
       "HeaderLines" -> 19
       ];

    columns = Transpose[Developer`ToPackedArray[N[data]]];

I extracted only the data columns, without the column titles, so that they can be stored in a packed array. This should considerably speed up any further work with the data.







answered 5 hours ago by Henrik Schumacher
























A slightly different approach is to split the data into lines first, then split each line into fields. Since we know the data begins on line 20, we can do this:

    rawData = Import["rawData.txt", Path -> NotebookDirectory[]];
    textLines = StringSplit[rawData, "\n"];
    dataIwant = ToExpression[StringSplit /@ textLines[[20 ;;]]];


We used ToExpression to convert from text strings to numbers. Now we can put the numbers into an association. We probably want to use the first column, time, as our key, but floating-point numbers are not good keys: two values that print identically can differ in their last bits, and a lookup may then fail to match. So don't do this:

    poor = Association @@ (First[#] -> Rest[#] & /@ dataIwant);
    poor[-39999.8]

If you get the right answer, it was just luck. A better way to treat this data is to convert the time from floating-point kiloyears to integer centuries (one kiloyear is ten centuries, hence the factor of 10). Then we can create a better association like this:

    better = Association @@ (Round[10 First[#]] -> Rest[#] & /@ dataIwant);
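The reasoning behind the rounded keys can be checked on a few made-up rows. This is a minimal sketch: the numbers are invented, and `rows` and `byCentury` are names chosen here purely for illustration.

```mathematica
(* Two machine numbers that both "mean" 0.3 but are not bit-identical. *)
a = 0.1 + 0.2;
b = 0.3;
a - b                          (* tiny, but not exactly zero *)
Round[10 a] === Round[10 b]    (* both round to the exact integer 3 *)

(* Made-up rows: time in kyr, followed by two invented values. *)
rows = {{-0.2, 1.5, 0.3}, {-0.1, 1.2, 0.2}, {0.0, 1.0, 0.1}};

(* Key each row by its time in integer centuries (10 x kyr, rounded). *)
byCentury = Association @@ (Round[10 First[#]] -> Rest[#] & /@ rows);
Keys[byCentury]                (* {-2, -1, 0} *)
byCentury[Round[10 (-0.1)]]    (* {1.2, 0.2} *)
```

Because the keys are exact integers, a lookup no longer depends on how a particular time value happened to be computed or typed in.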


Now our keys are exact numbers, but we still want to use kiloyears, so we write a function that converts our time in kiloyears to centuries and rounds off for us, like this:

    getData[kyr_] := better[Round[10 kyr]]
    getData[-3999.8123]

    (* {68.766, 27.806, 4.047, -1.184, 2.377} *)

Alternate versions of getData could interpolate the data or just give specific columns.
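Both variants might look roughly as follows. This is a sketch on made-up rows (time in kyr, then one invented sea-level value); `sampleRows`, `sampleAssoc`, `getColumn` and `seaLevel` are hypothetical names, and linear interpolation is just one possible choice.

```mathematica
(* Made-up rows: time in kyr, then sea level in m. *)
sampleRows = {{-0.3, 3.0}, {-0.2, 2.0}, {-0.1, 1.0}, {0.0, 0.0}};
sampleAssoc = Association @@ (Round[10 First[#]] -> Rest[#] & /@ sampleRows);

(* Variant 1: return only one of the stored columns. *)
getColumn[kyr_, col_Integer] := sampleAssoc[Round[10 kyr]][[col]]
getColumn[-0.2, 1]             (* 2. *)

(* Variant 2: linear interpolation between the tabulated times. *)
seaLevel = Interpolation[sampleRows, InterpolationOrder -> 1];
seaLevel[-0.15]                (* approximately 1.5 *)
```

Note that getColumn assumes the requested time is actually tabulated; for a missing key the association returns a Missing object, so a real version would probably want to check for that first.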







answered 4 hours ago by LouisB


























