This blog is all about Orange, a tool for data mining. We can do a lot with Orange: visual programming, data visualization, data exploration, data mining, and more. Orange is free and open-source, and you can install it easily on any OS.
Orange is an open-source data visualization, machine learning, and data mining toolkit. It features a visual programming front-end for explorative data analysis and interactive data visualization, and can also be used as a Python library.
Classifications: data visualization, machine lea…
Programming languages used: Python
Developer: University of Ljubljana
For Windows, download the .exe file from here.
For Linux, check out this blog here.
This is the blank canvas of Orange, where you will do all your data exploration. On the left-hand side there are five sections, each containing different widgets that we will use later for data exploration.
Check out the widget catalog of the Orange tool here.
Workflows in Orange are built by connecting widgets on the canvas; setting up an analysis in this way is intuitive and easy to inspect and modify. Passing data from one widget to another imposes no additional overhead in terms of CPU time.
Here we can see how the widgets in a workflow connect with one another.
Now let's go through the step-by-step process of creating a workflow.
In the image you can see that some data sets, such as iris.tab and heart_disease.tab, are available by default in the File widget. We can also upload our own data set from the local PC or fetch data from a URL.
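If you are curious what a .tab file looks like inside, here is a minimal sketch in plain Python. It assumes the usual Orange tab-delimited layout of three header rows (column names, column types, optional flags such as class) followed by the data rows. The `SAMPLE_TAB` text and the `read_tab` helper are my own names for illustration, not part of Orange, and the three data rows are only a tiny excerpt in the style of iris.tab.

```python
import csv
import io

# A tiny excerpt in the style of iris.tab (assumed layout: names, types, flags, data).
SAMPLE_TAB = (
    "sepal length\tsepal width\tpetal length\tpetal width\tiris\n"
    "continuous\tcontinuous\tcontinuous\tcontinuous\tdiscrete\n"
    "\t\t\t\tclass\n"
    "5.1\t3.5\t1.4\t0.2\tIris-setosa\n"
    "7.0\t3.2\t4.7\t1.4\tIris-versicolor\n"
    "6.3\t3.3\t6.0\t2.5\tIris-virginica\n"
)


def read_tab(text):
    """Parse tab-delimited text with three header rows into a list of dicts.

    Hypothetical helper for illustration; returns (rows, name_of_class_column).
    """
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    names, types, flags = rows[0], rows[1], rows[2]
    data = []
    for row in rows[3:]:
        record = {}
        for name, kind, value in zip(names, types, row):
            # Numeric columns become floats; everything else stays a string.
            is_numeric = kind in ("c", "continuous")
            record[name] = float(value) if is_numeric else value
        data.append(record)
    class_column = names[flags.index("class")]
    return data, class_column


data, class_column = read_tab(SAMPLE_TAB)
print(class_column, len(data), data[0])
```

In Orange itself the File widget does all of this for you; the sketch is only meant to show what information the three header rows carry.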
After loading the data, we create a Scatter Plot for it. In the image below you can see the scatter plot for the iris data set.
Here I load the Classification Tree workflow.
After loading the workflow, you can see that many widgets are connected to each other. We can also modify the widgets as needed.
The Classification Tree workflow is used to explore the classification of data using decision tree methods. Let's see the classification tree for the iris data set.
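To get a feel for what a decision tree does at each node, here is a small sketch of a single step: picking the threshold on one feature that gives the lowest weighted Gini impurity. This is a generic textbook illustration, not Orange's own tree implementation. The function names `gini` and `best_split` are my own, and the six petal-length values are just a handful of iris rows used as a toy sample.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the group is pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_impurity) of the best 'value <= threshold' split."""
    distinct = sorted(set(values))
    best = (None, float("inf"))
    # Candidate thresholds are midpoints between neighbouring distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [c for v, c in zip(values, labels) if v <= threshold]
        right = [c for v, c in zip(values, labels) if v > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if impurity < best[1]:  # strict comparison: the first of tied splits wins
            best = (threshold, impurity)
    return best


# Toy sample: petal length of six iris flowers and their species.
petal_length = [1.4, 1.4, 4.7, 4.5, 6.0, 5.1]
species = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]

threshold, impurity = best_split(petal_length, species)
print(f"split at petal length <= {threshold:.2f}, weighted Gini = {impurity:.3f}")
```

A full tree repeats this search over every feature and then recurses into the left and right groups until they are pure or too small to split.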
After that, I created a bar chart using Distribution. You can see it in the image below.
Here I have created a workflow with the Data Table widget. In the image you can see the data table for the iris data set.
Distribution: Used to get information about the distribution of the data.
Scatter Plot: Used to visualize the data as a scatter plot.
Bar Plot: Represents the data as bars. It is a very simple and basic plot.
Linear Projection: Used to visualize data with more than two features. It projects the higher-dimensional data linearly onto a plane so that it can be drawn in 2D.
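The idea behind a linear projection can be sketched in a few lines of plain Python. Each feature gets its own 2D axis direction, and a data point lands at the sum of its feature values multiplied by those directions. Placing the axes evenly around a circle is an assumption I make here for illustration, and `circular_axes` and `project` are my own helper names, not Orange functions.

```python
import math


def circular_axes(n_features):
    """Place one 2D axis direction per feature, evenly spaced on the unit circle."""
    return [
        (math.cos(2 * math.pi * i / n_features), math.sin(2 * math.pi * i / n_features))
        for i in range(n_features)
    ]


def project(point, axes):
    """Project an n-dimensional point onto the plane as a weighted sum of the axes."""
    x = sum(value * ax for value, (ax, _) in zip(point, axes))
    y = sum(value * ay for value, (_, ay) in zip(point, axes))
    return x, y


# Four iris features: sepal length, sepal width, petal length, petal width.
axes = circular_axes(4)
flower = (5.1, 3.5, 1.4, 0.2)
print(project(flower, axes))
```

With four features the axes point right, up, left, and down, so the flower above ends up roughly at (5.1 - 1.4, 3.5 - 0.2). Different axis placements give different views of the same data.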
Now we build a workflow with the Confusion Matrix widget. In the image you can see the confusion matrix for the iris data set.
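If the confusion matrix is new to you, this short sketch shows how one is counted: rows are the actual classes, columns are the predicted classes, and each cell counts how often that pair occurred. The labels and predictions below are made up purely for illustration and are not results from Orange; `confusion_matrix` is my own helper name.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Return a matrix where cell [i][j] counts actual labels[i] predicted as labels[j]."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


labels = ["setosa", "versicolor", "virginica"]

# Hypothetical example: six flowers, one versicolor mistaken for virginica.
actual = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
predicted = ["setosa", "setosa", "versicolor", "virginica", "virginica", "virginica"]

matrix = confusion_matrix(actual, predicted, labels)
# Correct predictions sit on the diagonal of the matrix.
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(actual)

for label, row in zip(labels, matrix):
    print(f"{label:<12}{row}")
print(f"accuracy: {accuracy:.3f}")
```

Everything off the diagonal is a mistake, so the matrix tells you not only how often the model is wrong but also which classes it confuses with each other.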
I hope you can now work with the Orange tool on your own. I tried to cover as much as I could; now you can explore further by yourself.
Explore more about the Orange tool here.