Topic Modeling
Topic Metadata
map.topics.metadata returns a pandas DataFrame in which each row corresponds to a unique topic. The metadata for each topic includes:
- topic depth
- a human-readable topic description (topic label)
- identifying keywords that differentiate the topic from other topics
# Returns a Pandas df
print(map.topics.metadata)
     depth  topic_id  topic_depth_1           topic_description                                  topic_short_description  topic_depth_2  topic_depth_3
0    1      0         Women's Fashion (3)     women/tops/dress/sandals/womens/casual/shoes/p...  Women's Fashion (3)      NaN            NaN
1    1      1         Electronics (5)         USB/Bluetooth/iPhone/charging/Intel/cable/HDMI...  Electronics (5)          NaN            NaN
2    1      2         Jewelry Collection (2)  jewelry/IceCarats/Jewelry/Type/ICECARATS/Sterl...  Jewelry Collection (2)   NaN            NaN
3    1      3         Phone Protector         phone/Galaxy/Samsung/dogs/Watch/protector/scre...  Phone Protector          NaN            NaN
4    1      4         Pool Supplies           Pool/pool/Floats/chair/Brand/Amazon/Lathe/floa...  Pool Supplies            NaN            NaN
...  ...    ...       ...                     ...                                                ...                      ...            ...
605  3      507       Lighting Replacement    hose/garden/Garden/watering/Hose/ft/plants/Jet...  Garden Hose              Plumbing S...  Garden Hose
606  3      508       Lighting Replacement    Rate/9930/gallons/207/months/38℃/125PSI/GPM/34...  Water Pump               Plumbing S...  Water Pump
607  3      509       Lighting Replacement    NPT/¼/½/PSI/Pump/Straight/tire/pump/12V/Connec...  Tire Pump                Plumbing S...  Tire Pump
608  3      510       Lighting Replacement    drain/Drain/sink/pipe/Sink/stopper/steel/toile...  Plumbing Fixtures        Plumbing S...  Plumbing Fixtures
609  3      511       Lighting Replacement    shower/water/Shower/filter/solar/fountain/head...  Bathroom Essentials      Plumbing S...  Bathroom Essentials
610 rows × 7 columns
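Because the metadata is an ordinary pandas DataFrame, standard pandas filtering and grouping apply to it. The sketch below is illustrative only: it builds a small stand-in frame from a few rows and three of the columns shown above (the variable name metadata is an assumption, standing in for map.topics.metadata), then selects the depth 1 topics and counts topics per depth.

```python
import pandas as pd

# Stand-in for map.topics.metadata: a few rows and columns copied from the
# output above. With a real map you would use map.topics.metadata instead.
metadata = pd.DataFrame(
    {
        "depth": [1, 1, 3, 3],
        "topic_id": [0, 1, 507, 508],
        "topic_short_description": [
            "Women's Fashion (3)",
            "Electronics (5)",
            "Garden Hose",
            "Water Pump",
        ],
    }
)

# Keep only the most general topics (depth 1) and list their labels.
top_level = metadata[metadata["depth"] == 1]
top_level_labels = top_level["topic_short_description"].tolist()
print(top_level_labels)

# Count how many topics exist at each depth.
topics_per_depth = metadata.groupby("depth").size().to_dict()
print(topics_per_depth)
```

The same boolean-mask pattern should work for any other column, for example selecting a single topic by topic_id.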
Topic Hierarchy
map.topics.hierarchy exposes your topic breakdown as a Python dictionary, so you can see which topics are most general and which subtopics each one contains.
# map.topics.hierarchy is a dict
hierarchy = map.topics.hierarchy
print(f'Your depth 1 (most general) topics are: {hierarchy.keys()}')
Your depth 1 (most general) topics are: dict_keys([
("Women's Fashion (3)", 1),
('Electronics (5)', 1),
('Jewelry Collection (2)', 1),
...
])
Each key is a (topic label, depth) tuple. Use a higher-level topic key to look up the lower-level topics it contains.
import random
# List the subtopics of a randomly chosen topic key; keys are not limited to depth 1
random_topic_1 = random.choice(list(hierarchy.keys()))
print(f'The general topic {random_topic_1} contains subtopics {hierarchy[random_topic_1]}')
The general topic ('Footwear (14)', 2) contains subtopics [
'Shoes (3)', 'Sandal', 'Sneaker Culture', ..., "Women's Sandals"
]
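To see the whole tree rather than one key at a time, you can walk the dictionary recursively. This is a minimal sketch under two assumptions suggested by the tuple keys above, not a documented guarantee: a subtopic with children of its own appears as the key (subtopic label, depth + 1), and a label with no key is a leaf. The labels below are hypothetical placeholders, and subtree_lines is an illustrative helper, not part of the library.

```python
# Hypothetical stand-in for map.topics.hierarchy, using placeholder labels.
hierarchy = {
    ("A", 1): ["A1", "A2"],
    ("A1", 2): ["A1a", "A1b"],
    ("B", 1): ["B1"],
}


def subtree_lines(hierarchy, label, depth, indent=0):
    """Return indented lines for a topic and everything beneath it."""
    lines = ["  " * indent + label]
    # Assumption: a topic that is not a key has no subtopics (it is a leaf).
    for child in hierarchy.get((label, depth), []):
        lines.extend(subtree_lines(hierarchy, child, depth + 1, indent + 1))
    return lines


# Start from the depth 1 keys and expand each one.
top_level_keys = [key for key in hierarchy if key[1] == 1]
tree = []
for label, depth in top_level_keys:
    tree.extend(subtree_lines(hierarchy, label, depth))

print("\n".join(tree))
```

Filtering on key[1] == 1 is also a simple way to restrict a lookup to the most general topics.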
Topic Groups
Pass a hierarchy depth to map.topics.group_by_topic to get a list of dictionaries, one for each distinct topic at that depth.
Each dictionary includes the keys subtopics, subtopic_ids, topic_id, topic_short_description, topic_long_description, and datum_ids.
your_depth_level = 2
print(map.topics.group_by_topic(your_depth_level)[0])
{
'subtopics': ['Miscellaneous (3)'],
'subtopic_ids': [87],
'topic_id': 16,
'topic_short_description': 'Audio Equipment (3)',
'topic_long_description': 'Bluetooth/speaker/Speaker/music/CarPlay/MP3/prevention/bluetooth/stereo/sound/karaoke/Loss/Radio/⭐/radio',
'datum_ids': {'61c', '/WM', 'Rsw', 'q6I', ..., 'AVjU'}
}
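The returned list is plain Python data, so it can be summarized with ordinary comprehensions. The sketch below is illustrative: the first entry reuses the values shown above (only the datum IDs that are visible in the truncated output, and a subset of the keys), the second entry is entirely hypothetical, and the variable name groups stands in for the result of map.topics.group_by_topic(your_depth_level).

```python
# Stand-in for map.topics.group_by_topic(2).
groups = [
    {
        # Values copied from the example output above (datum_ids truncated there).
        "subtopics": ["Miscellaneous (3)"],
        "subtopic_ids": [87],
        "topic_id": 16,
        "topic_short_description": "Audio Equipment (3)",
        "datum_ids": {"61c", "/WM", "Rsw", "q6I", "AVjU"},
    },
    {
        # Hypothetical second topic, for illustration only.
        "subtopics": [],
        "subtopic_ids": [],
        "topic_id": 999,
        "topic_short_description": "Example Topic",
        "datum_ids": {"a1", "b2"},
    },
]

# Number of data points assigned to each topic.
sizes = {g["topic_short_description"]: len(g["datum_ids"]) for g in groups}
print(sizes)

# The topic with the most data points.
largest = max(groups, key=lambda g: len(g["datum_ids"]))
print(largest["topic_short_description"])

# Reverse lookup: which topic does each datum belong to at this depth?
datum_to_topic = {d: g["topic_id"] for g in groups for d in g["datum_ids"]}
print(datum_to_topic["61c"])
```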