Self containment metric #216

Merged
andresmondragont merged 5 commits into main from caroline-metrics on Mar 3, 2026
Conversation

@carolineychen8
Contributor

No description provided.

return rog.reset_index(name='rog')

def self_containment(stops, threshold, agg_freq='d', weighted=True, home_activity_type='Home',
                     activity_type_col='activity_type', traj_cols=None, time_weights=None,
Collaborator

Let's assume this information is in the "location_id" column, which is passed as kwargs or in traj_cols, similar to home_attribution.

home_activity_type can become "home_id", defaulting to home but able to take any location id.
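
To make the suggestion concrete, here is a minimal sketch of how the location column name could be resolved. The helper name `resolve_location_col` and the precedence order (explicit keyword argument, then `traj_cols`, then the default) are assumptions for illustration, not the library's actual API.

```python
def resolve_location_col(traj_cols=None, default="location_id", **kwargs):
    """Return the name of the column that holds location ids.

    Hypothetical helper. Assumed precedence: an explicit `location_id`
    keyword argument, then the `traj_cols` mapping, then the default name.
    """
    explicit = kwargs.get("location_id")
    if explicit is not None:
        return explicit
    if traj_cols and traj_cols.get("location_id") is not None:
        return traj_cols["location_id"]
    return default


# Usage: each call shows one level of the assumed precedence.
print(resolve_location_col())
print(resolve_location_col(traj_cols={"location_id": "place"}))
print(resolve_location_col(traj_cols={"location_id": "place"}, location_id="loc"))
```

With the column name resolved this way, a `home_id` argument could then be compared against that column instead of against an activity type.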

non_home = g[g[activity_type_col] != home_activity_type]

if len(non_home) == 0:
    return np.nan  # No non-home activities
Collaborator

If there is NO activity at all, then np.nan is appropriate. But if there are stops at home and no non-home stops, then the containment should be 1.0.
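
A rough sketch of the edge-case handling described here. The helper name `containment_for_group`, the `location_id` column, the `"home"` default, and the definition used (unweighted share of non-home stops within `threshold` of home) are assumptions for illustration and may differ from the merged implementation.

```python
import numpy as np
import pandas as pd


def containment_for_group(g, threshold, home_id="home", location_col="location_id"):
    """Hypothetical per-group helper; unweighted for brevity."""
    if len(g) == 0:
        return np.nan  # no activity at all: containment is undefined
    non_home = g[g[location_col] != home_id]
    if len(non_home) == 0:
        return 1.0  # only home stops: fully self-contained
    # Assumed definition: share of non-home stops within `threshold` of home.
    return float((non_home["dist_from_home"] <= threshold).mean())


only_home = pd.DataFrame({"location_id": ["home", "home"],
                          "dist_from_home": [0.0, 0.0]})
mixed = pd.DataFrame({"location_id": ["home", "cafe", "office"],
                      "dist_from_home": [0.0, 120.0, 900.0]})
empty = only_home.iloc[0:0]

print(containment_for_group(only_home, threshold=500))  # home stops only
print(containment_for_group(mixed, threshold=500))      # one of two non-home stops is near
print(containment_for_group(empty, threshold=500))      # no stops at all
```

The two early returns keep "no data" (NaN) distinct from "never left home" (1.0), which is the distinction the comment asks for.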


# Check if all distances are NaN (no home location found)
if non_home['dist_from_home'].isna().all():
    return np.nan
Collaborator

This is overly defensive and fails silently. dist_from_home should never be np.nan for any of the stops, so we should just let it fail.

'2024-01-01 02:30',
'2024-01-01 03:30'
]),
'activity_type': ['Work', 'Shopping', 'Restaurant'],
Collaborator

If none of the stops corresponds to home_id, maybe raise a warning, but only when the WHOLE stop table does not contain a home.
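
One possible shape for that table-level check, as a sketch only. The function name `warn_if_no_home`, the `location_id` column, and the `"home"` default are hypothetical; the point is that the check runs once on the whole table rather than per group.

```python
import warnings

import pandas as pd


def warn_if_no_home(stops, home_id="home", location_col="location_id"):
    """Warn once if no stop in the whole table matches home_id.

    Hypothetical helper. Returns True when at least one home stop exists.
    """
    has_home = bool((stops[location_col] == home_id).any())
    if not has_home:
        warnings.warn(
            f"No stop has {location_col} == {home_id!r}; "
            "self containment may not be meaningful.",
            UserWarning,
            stacklevel=2,
        )
    return has_home


# Usage: a table with no home stop triggers the warning.
stops_without_home = pd.DataFrame({"location_id": ["work", "shop", "restaurant"]})
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    found = warn_if_no_home(stops_without_home)
print(found, len(caught))
```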


# Should return NaN when there are no non-home activities
assert len(result) == 1
assert pd.isna(result['self_containment'].values[0])
Collaborator

Self containment is expected to be 1.0, not NaN.

@andresmondragont self-requested a review on February 27, 2026 20:50
@andresmondragont
Contributor

Failing Test:

  • test_from_df_values_types_na
===================================================================== FAILURES ======================================================================
___________________________________________________________ test_from_df_values_types_na ____________________________________________________________

value_na_input_df =   user_id     timestamp  latitude  longitude              datetime
0       A  1672560000.0   40.7128    -74.006  2023-...2617600.0   36.1699  -115.1398  2023-01-01T18:00:00Z
4       C  1672531200.0   40.7128    -74.006  2023-01-01T00:00:00Z
expected_value_na_output_df =   user_id   timestamp  latitude  longitude            datetime  tz_offset
0       A  1672560000   40.7128    -74.006 2...699  -115.1398 2023-01-01 18:00:00          0
4       C  1672531200   40.7128    -74.006 2023-01-01 00:00:00          0

    def test_from_df_values_types_na(value_na_input_df, expected_value_na_output_df):
        """Tests from_df for correct values, type casting, and NA handling."""
        traj_cols = {
            "user_id": "user_id", "timestamp": "timestamp", "latitude": "latitude",
            "longitude": "longitude", "datetime": "datetime",
        }
        mixed_tz = 'naive'
    
        result = loader.from_df(value_na_input_df.copy(), traj_cols=traj_cols,
                                parse_dates=True, mixed_timezone_behavior=mixed_tz, sort_times=False)
    
>       pd.testing.assert_frame_equal(result, expected_value_na_output_df, check_dtype=True)
E       AssertionError: Attributes of DataFrame.iloc[:, 4] (column name="datetime") are different
E       
E       Attribute "dtype" are different
E       [left]:  datetime64[us]
E       [right]: datetime64[ns]

nomad/tests/test_io.py:249: AssertionError
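
The assertion above differs only in datetime resolution (`datetime64[us]` on the left, `datetime64[ns]` on the right). One way to make such a comparison independent of resolution is to cast naive datetime columns to a common unit before comparing. This is a sketch under that assumption; `normalize_datetime_unit` is a hypothetical helper and not necessarily the fix applied in this PR.

```python
import pandas as pd


def normalize_datetime_unit(df, unit="ns"):
    """Return a copy with every naive datetime64 column cast to one unit."""
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_datetime64_dtype(out[col]):
            out[col] = out[col].astype(f"datetime64[{unit}]")
    return out


# Two frames with equal values; resolutions may differ depending on the pandas version.
times = pd.Series(pd.to_datetime(["2023-01-01 00:00:00", "2023-01-01 18:00:00"]))
left = pd.DataFrame({"user_id": ["A", "C"], "datetime": times.astype("datetime64[us]")})
right = pd.DataFrame({"user_id": ["A", "C"], "datetime": times.astype("datetime64[ns]")})

# After normalization the strict dtype check passes.
pd.testing.assert_frame_equal(
    normalize_datetime_unit(left), normalize_datetime_unit(right), check_dtype=True
)
```

Whether to normalize in the loader or only in the test fixture is a design choice: normalizing in the loader pins the output dtype for users, while normalizing in the test only makes the test tolerant of it.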

Contributor

@andresmondragont left a comment

Fix the failing test: test_from_df_values_types_na

@andresmondragont added the bug label on Mar 1, 2026
@andresmondragont removed the bug label on Mar 1, 2026
@andresmondragont merged commit 1f384ad into main on Mar 3, 2026
@paco-barreras deleted the caroline-metrics branch on March 3, 2026 19:28