Microsoft launches Data Wrangler, a tool for data scientists that processes tabular data in Python

Data Wrangler is an extension for VS Code Insiders. It handles data preparation, cleaning, and visualization, helps users identify and fix data errors, analyzes data quality, and converts data into the required format.
Data Wrangler ships with a built-in library of transformation and visualization functions. When the user changes the data, the extension automatically generates code for that operation using open source Python libraries, so users can write data preparation programs faster and more accurately.
Because data quality directly affects the quality of model predictions, data scientists usually spend a lot of time preparing data. While exploring data, they write many small code fragments to delete columns or remove missing values. Microsoft noted that tools for simplifying data preparation are currently lacking, so data scientists often search Stack Overflow for code snippets and copy and paste them into their programs.
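The small fragments described here typically look something like the following. This is only an illustrative sketch using standard pandas calls; the table `raw` and its column names are invented for the example and do not come from Microsoft.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented for illustration only.
raw = pd.DataFrame({
    "city": ["Taipei", "Tokyo", None, "Tokyo"],
    "temp_c": [28.5, np.nan, 30.1, np.nan],
    "unused": [0, 0, 0, 0],
})

# Fragment 1: delete a column that is not needed.
no_unused = raw.drop(columns=["unused"])

# Fragment 2: remove every record that has at least one missing value.
complete = no_unused.dropna()

# Fragment 3: alternatively, fill missing temperatures with the column mean
# (mean() skips NaN by default) instead of dropping the records.
filled = no_unused.fillna({"temp_c": no_unused["temp_c"].mean()})

print(complete)
print(filled)
```

Each fragment is short, but a typical cleaning session strings together dozens of them, which is the repetitive work the article says the tool aims to reduce.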
Data Wrangler's interactive user interface can quickly generate code for users. When users view and visualize data frames (DataFrames) from pandas, the Python data analysis library, Data Wrangler can generate code for the target operation. For example, a user only needs to right-click a column header and choose delete, and Data Wrangler automatically generates the Python code that does this.
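As a rough idea of what code for such a delete operation could look like, here is a hand-written pandas sketch. It is not actual Data Wrangler output; the helper name `drop_column` and the sample data are assumptions made for this example.

```python
import pandas as pd

def drop_column(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Return a new data frame without the named column (hypothetical helper)."""
    # DataFrame.drop returns a copy by default, so the input is left unchanged.
    return df.drop(columns=[column])

# Hypothetical sample data, invented for illustration only.
df = pd.DataFrame({
    "name": ["Ann", "Bob"],
    "age": [31, 45],
    "notes": ["", "n/a"],
})

df_clean = drop_column(df, "notes")
print(list(df_clean.columns))
```

Wrapping the step in a function that returns a new data frame keeps the original data intact, which makes it easy to compare the result before and after the operation.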
In addition, writing custom code to create a new derived column from existing columns of a pandas data frame is error-prone. Data Wrangler lets users provide an example of the output they want in the derived column, and the extension then writes the Python code using PROSE, an AI program synthesis technology.
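To make the idea concrete, suppose a user has `first` and `last` columns and types the single example "LOVELACE, A." for the first record. The code below is a hand-written sketch of a transformation consistent with that example; it is not generated by PROSE, and the function name `derive_display_name` and the data are hypothetical.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
people = pd.DataFrame({
    "first": ["Ada", "Alan"],
    "last": ["Lovelace", "Turing"],
})

def derive_display_name(df: pd.DataFrame) -> pd.DataFrame:
    """Add a 'display_name' column of the form 'LASTNAME, F.' (hypothetical helper)."""
    out = df.copy()
    # Upper-case the last name, then append the first initial and a period.
    out["display_name"] = out["last"].str.upper() + ", " + out["first"].str[0] + "."
    return out

derived = derive_display_name(people)
print(derived["display_name"].tolist())
```

Writing this string logic by hand is where small mistakes creep in, such as a wrong index or a missing separator, which is why deriving it from an example can be attractive.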
Data scientists who want to use Data Wrangler in VS Code Insiders can now download it directly from the marketplace. They can launch Data Wrangler from a pandas data frame in a Jupyter Notebook, or open a CSV or Parquet file directly in VS Code Insiders.