Which of the following represent data types that can be joined together Alteryx

I was a little confused as to which of these tools to use and so I assume other people may be a little confused too. But have no fear.. the difference is straightforward and easy to understand.

The join tool matches selected incoming fields from the L and R inputs. The J output will show the rows that do match, the R output shows the rows from the R input that could not be matched with the L input, and vice versa.

We have three options for joining datasets via the join tool:

  • Single field
  • Multiple fields
  • Record positions

We might want to join by multiple fields if, for example, we have a list of first names and last names featuring repeating first and or last names. We only want to join the unique combinations of first and last names, as many people share first names or last names, as per the example below:

Which of the following represent data types that can be joined together Alteryx

By joining on first name and last name we eliminate unnecessary duplicates.

Joining by record position will stick the R dataset onto the L dataset, matching each row simply by their positions.

Append fields

The Append field performs a cartesian join on both incoming datasets; joining every row of the S (source) table onto every row of the T (target) table, as shown below.

Which of the following represent data types that can be joined together Alteryx

The number of joined output records will be (number of records in S table) x (number of records in T table). This must be approached with caution as we can potentially end up with very large datasets that are slow to process and take up a lot of space.

Similarities:

  • Both tools will make the dataset wider
  • With both tools we can reorder fields
  • Select and deselect fields
  • Modify the data type
  • Rename fields

Differences:

  • The Append fields tool has one outlet.
  • Join tool has three. In the J output it will only output those fields that match each other based on selected fields in configuration.
  • The join tool is case sensitive, where as case does not matter in append fields.

Which of the following represent data types that can be joined together Alteryx

Which of the following represent data types that can be joined together Alteryx

You just got access to your dataset and you are ready to start spraying out cool analytics. Especially if you have a good idea about how your data is looking, it is tempting to skip exploring the data before diving head first into it. 

Alteryx is full of tools that make an EDA(Exploratory Data Analysis) too easy to skip. By just spending 5 minutes looking into your data you might find errors or relations in your dataset that you would otherwise have missed. This is meant as a basic guide to a quick and general EDA and hopefully helps you avoid getting lost.

Which of the following represent data types that can be joined together Alteryx

An EDA will help you: 

  • Find missing data, wrong data types and data sizes
  • Find outliers that might or might not be explainable 
  • Find new relations between parameters 
  • Find general trends in the data

Keep it simple – use the result window 

The result window is always in your view and is a great way to quickly check your data. In the results window, the data quality bar can be viewed above each column. Colours indicate what percentage of the column contains OK values(green), empty values(Grey), Null values(yellow), and not OK values(red). The specific meaning of each colour can be seen here. The Data Cleansing tool is great at cleaning your data. Insert this tool to remove leading or trailing white spaces, remove null rows and columns, and for other quick cleaning methods. 

Clicking the ‘Metadata’ button in the result window will show you the datatype, size, source, and description of each column. A too small column size or wrong datatype will mess with your analysis sooner or later; make it a habit to check this as the first thing. To quickly change these values use the Select tool.

Which of the following represent data types that can be joined together Alteryx

Browse Tool – Data Profile 

Probably the most used tool in Alteryx for a reason. Apart from looking at the actual data points themselves, the browse tool is great at giving an overview of your data. In the Browse tool configuration window, a profile of the data is available. The data quality bar is also available here together with the distribution of values in each column. By clicking on the name of a column in the configuration window, a more detailed profile of the specific column will appear.  Depending on the datatype different statistical values such as mean, standard deviation, max, min, average, shortest value, longest value, ect will be displayed. Here it is easy to check whether the min, max and other values make sense and to further investigate if they don’t. The view will also give you a histogram of your data. Maybe you can find a nice bell curve somewhere in your data. For a fancy histogram, a violin plot can be generated using the ‘Violin Plot’ tool. 

Which of the following represent data types that can be joined together Alteryx

Other tools in the Data Investigation category that will help you gain similar insight are Basic Data Profiling, Field Summary, and Frequency Table. 

Look at the numbers

Which of the following represent data types that can be joined together Alteryx

Looking at insight for individual columns is important, however, seeing how different columns relate to each other can be just as insightful. With the Association Analysis tool, a correlation matrix can easily be generated for numerical columns. Here a correlation coefficient close to 1 indicates a high positive correlation, 0 indicates no correlation and a correlation coefficient close to -1 indicates a high negative correlation. If you are using your data for data modelling you might have to remove columns that are correlated to improve the model and reduce the overall complexity. To figure out how to reduce the dimensionality of your data multiple methods can be used, some of them can be found here.

Which of the following represent data types that can be joined together Alteryx

For correlated columns, the Scatter Plot tool can be used to gain a quick visual insight into how the values are correlated and general trends in the data. You might find that parameters that you thought would be linearly correlated are actually exponentially correlated. In the scatter plot it is possible to visually detect outliers. Many definitions of outliers exist, with one of them being a point that is more than 3 standard deviations away from the mean. The standard deviation and mean for a column can be found with the browse tool. Whether data points should be removed depends on the specific case, however, it should always be justified in the analysis.

Which of the following represent data types that can be joined together Alteryx

By just spending five minutes adding a few tools to your workflow you will easily be able to find insights that you might otherwise have missed and avoid frustration further into your analysis. The methods discussed are all a part of a simple EDA, with more advanced EDA methods existing. However, for most cases the tools discussed should be more than enough to clean your data and find new insights, enabling you to make an efficient and thorough analysis. 


When you run into problems make sure to visit the Alteryx Community where it is almost certain that someone else has had the same problem as you. If you have any questions or comments please reach out to me.

What data types can be joined together in Alteryx?

Join Tool.
String fields can only be joined to other string fields..
Numeric fields can only be joined to other numeric fields..
Boolean fields can only be joined to other boolean fields..
DateTime fields types can only be joined to their exact type..
Spatial fields cannot be joined, use the Spatial Match Tool instead..

Which two options can be used to perform a join in Alteryx?

Select how to perform the Join. The 2 options are: by record position or by a specific field (column). Join by Record Position: Select this option when the two tables to be joined have the same field structure, and the data will be joined by its position within the two tables.

What is join multiple in Alteryx?

The Join Multiple tool combines two or more inputs based on a commonality between the input tables. By default, the tool outputs a full outer join. See Join Tool.

Which tools in the join category allow the data type and size of the fields to be changed?

You can also use the Select tool to modify the type and size of data, rename a column, add a description, and more. Other tools that include the functionality found in the Select tool are Append Fields, Find Nearest, Join, Join Multiple, Select In-DB, and Spatial Match.