10 Commits

Author SHA1 Message Date
Ching L
6d1fffb63d feat(crawler): add main category support for better classification
- Add CATEGORY_MAPPING dictionary to map sub-categories to main categories
- Implement get_main_category function to find parent category
- Include main_category field in article data structure
- Update toot function to display both main and sub categories intelligently
- Avoid duplication when main category is the same as sub category
2025-12-09 10:58:01 +08:00
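The category handling described in this commit might look roughly like the sketch below. Only the names `CATEGORY_MAPPING` and `get_main_category` come from the commit message; the mapping entries, the `format_category` helper, and the " / " separator are assumptions made for illustration, not the repository's actual code.

```python
from typing import Optional

# Hypothetical entries: sub-category -> main category (not the real mapping).
CATEGORY_MAPPING = {
    "CPU": "Hardware",
    "GPU": "Hardware",
    "Hardware": "Hardware",
    "Phones": "Mobile",
}


def get_main_category(sub_category: str) -> Optional[str]:
    """Return the parent category, or None if the sub-category is unknown."""
    return CATEGORY_MAPPING.get(sub_category)


def format_category(sub_category: str) -> str:
    """Hypothetical helper: build the category label shown in a toot.

    Shows "main / sub" when both are known and different; otherwise shows
    the sub-category alone, so the same word is never printed twice.
    """
    main_category = get_main_category(sub_category)
    if main_category is None or main_category == sub_category:
        return sub_category
    return f"{main_category} / {sub_category}"
```

With these example entries, `format_category("CPU")` gives a combined label, while `format_category("Hardware")` returns the single word rather than repeating it.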
Ching L
3bbe483c64 feat(crawler): add cloudscraper to bypass Cloudflare protection
- Replace requests with cloudscraper for image downloading
- Update log file path to use home directory logs
- Add timeout parameter for image requests to prevent hanging
2025-12-05 21:07:40 +08:00
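A minimal sketch of the image download with a timeout, assuming the session comes from `cloudscraper.create_scraper()`, which returns a `requests`-compatible session. The function names, the default timeout of 10 seconds, and the destination path handling are assumptions, not taken from the repository.

```python
from pathlib import Path


def make_scraper():
    """Create a Cloudflare-aware session (requires the third-party cloudscraper package)."""
    import cloudscraper  # imported lazily so the helper below works with any session

    return cloudscraper.create_scraper()


def download_image(session, url: str, dest: Path, timeout: float = 10.0) -> Path:
    """Download url to dest; the timeout keeps a stalled request from hanging the crawler."""
    response = session.get(url, timeout=timeout)
    response.raise_for_status()  # surface HTTP errors instead of saving an error page
    dest.write_bytes(response.content)
    return dest
```

Passing the session in as an argument keeps the download logic independent of how the session was created, so it can be exercised with a stub instead of a live request.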
Ching L
da1969b103 fix(crawler): replace print statements with logger for better logging
Updated the crawler to use the logger for outputting article information and toot notifications, enhancing the logging mechanism for improved monitoring and debugging.
2025-04-07 11:33:28 +08:00
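The change from `print` to a logger could be sketched as follows. The logger name `crawler`, the message format, and the `report_article` helper are assumptions for illustration; the commit does not show the actual calls.

```python
import logging

# Basic configuration for the sketch; the real crawler may configure handlers differently.
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("crawler")  # hypothetical logger name


def report_article(article: dict) -> str:
    """Log an article summary instead of printing it, and return the summary."""
    summary = f"{article['title']} ({article['url']})"
    logger.info("New article: %s", summary)  # previously a print() call
    return summary
```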
Ching L
15addaba24 feat(crawler): update crawler to use RSS feed for article retrieval
Replaced HTML scraping with RSS feed parsing to fetch article details including title, URL, author, date, category, content, and image link. This improves reliability and efficiency in gathering articles from the source.
2025-04-07 11:32:00 +08:00
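As a rough illustration of RSS-based retrieval, the sketch below parses an RSS 2.0 document with the standard library. The sample feed, the field names in the resulting dictionaries, and the use of `<enclosure>` for the image link are assumptions; the actual crawler may use a different parser and a different feed layout.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed for illustration only.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Sample article</title>
      <link>https://example.com/a/1</link>
      <author>alice</author>
      <pubDate>Mon, 07 Apr 2025 03:00:00 GMT</pubDate>
      <category>CPU</category>
      <description>Short summary.</description>
      <enclosure url="https://example.com/a/1.jpg" type="image/jpeg" length="0"/>
    </item>
  </channel>
</rss>"""


def parse_feed(xml_text: str) -> list:
    """Return one dict per <item> with the fields named in the commit message."""
    root = ET.fromstring(xml_text)
    articles = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        articles.append({
            "title": item.findtext("title", default=""),
            "url": item.findtext("link", default=""),
            "author": item.findtext("author", default=""),
            "date": item.findtext("pubDate", default=""),
            "category": item.findtext("category", default=""),
            "content": item.findtext("description", default=""),
            # The image link is optional, so missing enclosures map to None.
            "image": enclosure.get("url") if enclosure is not None else None,
        })
    return articles
```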
Ching
c5bf60858c refactor: modify short-link retrieval logic
2024-04-17 10:18:17 +08:00
Ching
a4c7f76216 refactor: Add URL shortening functionality
2024-04-16 14:12:58 +08:00
Ching
c0533a4772 feat(crawler): change toot format
Change toot format

Signed-off-by: Ching <loooching@gmail.com>
2023-07-17 14:34:03 +08:00
Ching
129df366ed feat(crawler): add logger, modify sending logic
Add logger, modify sending logic

Signed-off-by: Ching <loooching@gmail.com>
2023-07-17 11:49:05 +08:00
Ching
5dfbfa5c57 feat(crawler): add chh crawler function and toot-posting function
Add chh crawler function and toot-posting function

Signed-off-by: Ching <loooching@gmail.com>
2023-07-16 21:20:04 +08:00
a8de1b5643 Initial commit
2023-07-16 18:47:14 +08:00