Caching the News

When the user provides an RSS URL and no date, the rss-reader fetches the feed items from the specified source and prints them in normal or JSON format, depending on the selected options. While doing so, it also caches the news it has read.

The utility caches the feed data as follows. When a feed is read, a dictionary of the feed’s information is created, storing its title, date, content, news link, image link, RSS source, and the path to the feed’s cache directory. The utility creates a cache directory for each feed in the cached_news folder. In the feed’s directory, the article from the feed’s news page is downloaded into a text file, the links in that article are extracted and stored in another text file, and the images in the article are downloaded into a subdirectory named “images”. This is done so that the utility can later convert the feeds into HTML or PDF.
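The per-feed caching step described above can be sketched roughly as follows. This is a minimal illustration, not the utility's actual code: the function names (`make_feed_entry`, `save_article`), the dictionary keys, the file names `article.txt` and `links.txt`, and the rule for naming a feed's directory after its title are all assumptions. Only the `cached_news` folder and the `images` subdirectory come from the description.

```python
import re
from pathlib import Path

CACHE_ROOT = Path("cached_news")  # folder name taken from the description


def make_feed_entry(title, date, content, link, image_link, source,
                    cache_root=CACHE_ROOT):
    """Build the per-feed dictionary and create its cache directory."""
    # Hypothetical naming rule: a filesystem-safe version of the title.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", title).strip("_") or "untitled"
    feed_dir = Path(cache_root) / slug
    # Creating "images" with parents=True also creates the feed directory.
    (feed_dir / "images").mkdir(parents=True, exist_ok=True)
    return {
        "title": title,
        "date": date,
        "content": content,
        "link": link,
        "image_link": image_link,
        "source": source,
        "cache_dir": str(feed_dir),
    }


def save_article(entry, article_text, links):
    """Store the downloaded article text and its extracted links."""
    feed_dir = Path(entry["cache_dir"])
    # File names below are placeholders chosen for this sketch.
    (feed_dir / "article.txt").write_text(article_text, encoding="utf-8")
    (feed_dir / "links.txt").write_text("\n".join(links), encoding="utf-8")
```

Downloading the article and its images is left out here; the sketch only shows where the results would be written once they have been fetched.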

Then, for each feed, a tuple is constructed whose first element is the news date and whose second element is the dictionary described above. All tuples, one per feed, are stored in a list that is saved to a file in the cached_news directory. Cached news is fetched by news date, which is why the date is the first element of each tuple. The structure is shown in the image below.

caching structure
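As a rough sketch of this structure, the list of (date, dictionary) tuples could be saved and queried as shown here. The on-disk format is not specified in this document, so JSON is an assumption made for illustration, as are the function names (`save_cache`, `load_cache`, `news_for_date`) and the use of date strings such as "20200101".

```python
import json
from pathlib import Path


def save_cache(entries, cache_file):
    """Write a list of (date, feed_dict) tuples to cache_file."""
    Path(cache_file).parent.mkdir(parents=True, exist_ok=True)
    with open(cache_file, "w", encoding="utf-8") as fh:
        # JSON has no tuple type, so each tuple is written as a 2-item array.
        json.dump(entries, fh, ensure_ascii=False, indent=2)


def load_cache(cache_file):
    """Read the cache back as a list of (date, feed_dict) tuples."""
    path = Path(cache_file)
    if not path.exists():
        return []
    with open(path, encoding="utf-8") as fh:
        return [(date, feed) for date, feed in json.load(fh)]


def news_for_date(entries, date):
    """Return the feed dictionaries whose cached date equals `date`."""
    return [feed for entry_date, feed in entries if entry_date == date]
```

Keeping the date as the first element of each tuple means a lookup only has to compare that element, without opening the dictionary for each feed.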

The cached_news directory would look like this:

cached_news directory structure

Each feed’s directory would look like this:

feed directory structure