Thanks to the pandas library in Python, data manipulation and comparison can be done in only a few lines of code. Today's code pill is about comparing two similar CSV files with only a couple of lines.
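import pandas as pd


def compare_csv(first_path, second_path):
    # Assumed: both arguments are paths to CSV files with the same columns.
    df_first = pd.read_csv(first_path)
    df_second = pd.read_csv(second_path)

    # Rows of the first CSV that also appear in the second CSV
    result_same = df_first[df_first.apply(tuple, axis=1).isin(df_second.apply(tuple, axis=1))]
    print("Same")
    print(result_same)

    # Tilde (~) negates the mask: rows of the first CSV that are not in the second
    result_diff = df_first[~df_first.apply(tuple, axis=1).isin(df_second.apply(tuple, axis=1))]
    print("Difference")
    print(result_diff)


if __name__ == '__main__':
    compare_csv("first.csv", "second.csv")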
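Each row is first converted to a tuple so that whole rows can be compared. We then use the pandas isin function to find the rows of the first data frame that also appear in the second, and we negate that result with the tilde (~) operator to find the rows that differ.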
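To see the idea in isolation, here is a minimal sketch that uses two small in-memory data frames instead of CSV files. The sample data and the names rows_first, rows_second and mask are made up for illustration; only isin, apply and the ~ operator come from the code in this post.

```python
import pandas as pd

# Hypothetical sample data standing in for first.csv and second.csv.
df_first = pd.DataFrame({"id": [1, 2, 3], "name": ["Ann", "Bob", "Cid"]})
df_second = pd.DataFrame({"id": [1, 2, 4], "name": ["Ann", "Bea", "Dan"]})

# Turn every row into a tuple so that whole rows can be compared.
rows_first = df_first.apply(tuple, axis=1)
rows_second = df_second.apply(tuple, axis=1)

# Boolean mask: True where a row of the first frame also exists in the second.
mask = rows_first.isin(rows_second)

result_same = df_first[mask]   # rows found in both frames
result_diff = df_first[~mask]  # rows found only in the first frame

print("Same")
print(result_same)
print("Difference")
print(result_diff)
```

Note that the comparison runs in one direction only: rows that exist only in the second frame are not reported. Swapping the two frames would list those instead.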
Result: [image: output of the script]
In this code pill, we learned how to compare two similar CSV files. See you in the next code pill.