In this post, we will see how to detect the immediate parent rows of each row and add their values to new columns.
Question:
I have a dataframe that contains multiple hierarchy columns and a value attached to them, something like this:

| | location hierarchy | industry hierarchy | value |
|---|---|---|---|
| 0 | US | All | V1 |
| 1 | US-California | All | V2 |
| 2 | US-California | All-HT | V3 |
| 3 | US-California-LA | All | V4 |
| 4 | US-California-LA | All-HT | V5 |
| 5 | US-California-LA | All-HT-SoftWare | V6 |
| 6 | US-California-LA | All-Manufacturing | V7 |

Now I want to find the immediate parents of each row and add their values as new columns. My example shows at most two parents per row, but there could be more. The expected output would be as follows:

| | location hierarchy | industry hierarchy | value | value-parent-1 | value-parent-2 |
|---|---|---|---|---|---|
| 0 | US | All | V1 | NA | NA |
| 1 | US-California | All | V2 | V1 | NA |
| 2 | US-California | All-HT | V3 | V2 | NA |
| 3 | US-California-LA | All | V4 | V2 | NA |
| 4 | US-California-LA | All-HT | V5 | V3 | V4 |
| 5 | US-California-LA | All-HT-SoftWare | V6 | V5 | NA |
| 6 | US-California-LA | All-Manufacturing | V7 | V4 | NA |

My dataframe could be extremely large, so speed and memory are important factors, and a row-wise approach won't really work.
Logic:
The immediate parents of a row are all rows that sit exactly one level higher in one of the hierarchies while matching the row in the others. For example, for row 5 the potential immediate parents are:

| | location hierarchy | industry hierarchy |
|---|---|---|
| 0 | US-California | All-HT-SoftWare |
| 1 | US-California-LA | All-HT |

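To make the rule concrete, here is a small plain-Python illustration for a single row. It assumes levels are joined by `-`, as in the example data; the function name `candidate_parents` is made up for this illustration. It is only meant to show the rule, not to be applied row by row on a large dataframe.

```python
def candidate_parents(keys):
    """Return the candidate immediate-parent key tuples for one row.

    `keys` is a tuple with one hierarchy string per hierarchy column,
    with levels joined by '-'.
    """
    parents = []
    for i, key in enumerate(keys):
        if "-" not in key:
            continue  # top level: no parent along this hierarchy
        parent = key.rsplit("-", 1)[0]  # drop only the last level
        # Move up one level in hierarchy i, keep the others unchanged.
        parents.append(keys[:i] + (parent,) + keys[i + 1:])
    return parents


# Row 5 of the example
print(candidate_parents(("US-California-LA", "All-HT-SoftWare")))
# [('US-California', 'All-HT-SoftWare'), ('US-California-LA', 'All-HT')]
```

A candidate only counts as a parent if a row with those keys actually exists in the dataframe. For row 5, only the second candidate exists, which is why the expected output shows a single parent value, V5.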
This is the code to create the dataframe if you want to play with it.
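The original snippet did not survive in this copy of the post. The following is a minimal reconstruction from the example table in the question, assuming pandas and the column names shown there (with the spelling "hierarchy"); it is not the asker's original code.

```python
import pandas as pd

# Reconstructed from the example table in the question (not the original snippet).
df = pd.DataFrame(
    {
        "location hierarchy": [
            "US",
            "US-California",
            "US-California",
            "US-California-LA",
            "US-California-LA",
            "US-California-LA",
            "US-California-LA",
        ],
        "industry hierarchy": [
            "All",
            "All",
            "All-HT",
            "All",
            "All-HT",
            "All-HT-SoftWare",
            "All-Manufacturing",
        ],
        "value": ["V1", "V2", "V3", "V4", "V5", "V6", "V7"],
    }
)

print(df)
```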
Best Answer:
Here you are:
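The answer's code is missing from this copy of the post. What follows is not the original answer but a minimal vectorized sketch of one way to produce the expected output. It assumes pandas and NumPy, levels joined by `-`, and that each combination of hierarchy values appears at most once in the dataframe. The helper names `parent_key` and `add_parent_values` are hypothetical.

```python
import numpy as np
import pandas as pd


def parent_key(s: pd.Series) -> pd.Series:
    """Drop the last '-'-separated level; top-level entries get NaN."""
    parent = s.str.rsplit("-", n=1).str[0]
    return parent.where(s.str.contains("-", regex=False))


def add_parent_values(df, hier_cols, value_col="value"):
    hier_cols = list(hier_cols)
    right = df[hier_cols + [value_col]]
    found = []
    for col in hier_cols:
        # Candidate parent: one level up in `col`, same keys elsewhere.
        keys = df[hier_cols].copy()
        keys[col] = parent_key(df[col])
        # A left merge keeps the row order of `keys`; validate="m:1"
        # raises if the hierarchy combinations in df are not unique.
        merged = keys.merge(right, how="left", on=hier_cols, validate="m:1")
        found.append(merged[value_col].to_numpy())

    vals = np.column_stack(found)  # shape: (n_rows, n_hierarchies)
    # Push missing parents to the right; the stable sort keeps the
    # column order among the parents that were found.
    order = np.argsort(pd.isna(vals), axis=1, kind="stable")
    vals = np.take_along_axis(vals, order, axis=1)

    out = df.copy()
    for i in range(vals.shape[1]):
        out[f"{value_col}-parent-{i + 1}"] = vals[:, i]
    return out


# Example data from the question
df = pd.DataFrame(
    [
        ("US", "All", "V1"),
        ("US-California", "All", "V2"),
        ("US-California", "All-HT", "V3"),
        ("US-California-LA", "All", "V4"),
        ("US-California-LA", "All-HT", "V5"),
        ("US-California-LA", "All-HT-SoftWare", "V6"),
        ("US-California-LA", "All-Manufacturing", "V7"),
    ],
    columns=["location hierarchy", "industry hierarchy", "value"],
)

result = add_parent_values(df, ["location hierarchy", "industry hierarchy"])
print(result)
```

Each hierarchy column contributes at most one candidate parent per row, so the sketch does one left merge per hierarchy column and produces as many `value-parent-N` columns as there are hierarchy columns. No Python-level loop over rows is involved, although each merge does build a temporary frame the size of the input.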
Source: Stackoverflow.com