Paxata - an excellent tool to treat text
Overall Satisfaction with Paxata
Paxata is used in our business to speed up the cleaning, enrichment, and preparation of data that feeds BI dashboards, which in turn drive insights and business decisions. Multiple verticals in both the analytics and the risk practices currently use it to service clients.
Pros
- Visualizes distributions in large data sets effectively, which enables the user to quickly spot outliers and treat them appropriately
- Provides recommendations for merging datasets based on matching column values
- The cluster and edit feature is, in my opinion, its most powerful feature; it reduces cardinality in text columns
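To make the last point concrete, here is a minimal sketch of how a cluster-and-edit step can reduce cardinality in a text column. It is not Paxata's actual algorithm, which the review does not describe; it assumes a simple key-collision approach in which values that normalize to the same "fingerprint" are grouped and replaced with the most frequent spelling. The function names `fingerprint` and `cluster_and_edit` are hypothetical.

```python
import string
from collections import Counter, defaultdict


def fingerprint(value: str) -> str:
    """Normalize a value: lowercase, drop punctuation, sort the unique tokens."""
    cleaned = value.strip().lower().translate(
        str.maketrans("", "", string.punctuation)
    )
    return " ".join(sorted(set(cleaned.split())))


def cluster_and_edit(values):
    """Group values by fingerprint and rewrite each to its cluster's most common spelling.

    Assumption: a shared fingerprint means "same entity". Real tools typically
    let the user review each cluster before applying the edit.
    """
    clusters = defaultdict(list)
    for v in values:
        clusters[fingerprint(v)].append(v)

    mapping = {}
    for members in clusters.values():
        canonical = Counter(members).most_common(1)[0][0]
        for m in members:
            mapping[m] = canonical
    return [mapping[v] for v in values]


column = ["Acme Inc.", "acme inc", "ACME, Inc.", "Acme Inc.", "Globex Corp", "globex corp."]
cleaned_column = cluster_and_edit(column)
print(len(set(column)), "distinct values before,", len(set(cleaned_column)), "after")
```

In this toy column the distinct spellings collapse into one value per company, which is the cardinality reduction the feature is praised for.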
Cons
- Doesn't provide recommendations on how to impute values
- The tool lags quite often
Return on Investment
- We can tell at first glance whether a column has errors or quality issues
- It saves time when cleaning data
- It reduces the need for many data engineers/stewards and hence has a positive impact on the business's return
Alternatives Considered
- Talend Data Preparation
Paxata is a much better tool when it comes to handling natural language, but Talend provides recommendations on how to impute missing values and treat outliers. Paxata provides recommendations on dataset tie-ups and joins, whereas Talend doesn't provide any such recommendations. In Paxata you can visualize the distribution of data in a column and filter it by dragging to select the section you'd like to retain.
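As a rough illustration of what a join recommendation based on matching column values could look like, the sketch below scores every pair of columns across two datasets by the Jaccard overlap of their distinct values and suggests the pairs above a threshold. This is an assumption about the general idea, not a description of how Paxata computes its suggestions; the function `suggest_join_keys`, the `min_overlap` parameter, and the sample data are all hypothetical.

```python
def suggest_join_keys(left, right, min_overlap=0.5):
    """Suggest join column pairs between two datasets.

    `left` and `right` map column names to lists of values.
    Returns (left_column, right_column, score) tuples, best first,
    where score is the Jaccard overlap of the columns' distinct values.
    """
    suggestions = []
    for lcol, lvals in left.items():
        lset = set(lvals)
        for rcol, rvals in right.items():
            rset = set(rvals)
            union = lset | rset
            if not union:
                continue  # both columns empty, nothing to compare
            score = len(lset & rset) / len(union)
            if score >= min_overlap:
                suggestions.append((lcol, rcol, score))
    return sorted(suggestions, key=lambda s: s[2], reverse=True)


# Hypothetical sample data
customers = {"cust_id": [1, 2, 3, 4], "name": ["a", "b", "c", "d"]}
orders = {"customer": [2, 3, 4, 5], "amount": [10, 20, 30, 40]}

for lcol, rcol, score in suggest_join_keys(customers, orders):
    print(f"join {lcol} with {rcol} (overlap {score:.2f})")
```

Here only `cust_id` and `customer` share enough values to be suggested, even though their names differ.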