July 30, 2008

Removing Flash-based content from feeds with Yahoo Pipes

Yahoo Pipes

I have not written about Yahoo Pipes in detail on this blog yet, but I ran into something interesting and wanted to post about it. In short, Yahoo Pipes lets you build feeds or widgets, in several formats, from web-accessible sources. I find it very convenient for aggregating and filtering feeds.


Recently, I wanted to remove some Flash-based content from a feed that I follow, since I cannot view it in my reader anyway. I tried using the Regex module to match the tags for the content and replace them with a note stating that the content had been removed. Surprisingly, many of the regular expressions I entered did not work, and the Flash-based content remained in the feed. After testing quite a few expressions, I finally found one that worked. Here it is:

[<][^<]*application/x-shockwave-flash[^>]*[>][<]/[a-z]*[>]

I still do not understand what was wrong with the other expressions I tried. Most of them were simpler and more specific than this one, and they should have matched. Anyway, this works for now.
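For anyone who wants to try the expression outside of Pipes, here is a minimal sketch in Python. It assumes Python's `re` module behaves closely enough to the Pipes Regex module for this pattern, which I have not verified; the sample markup, the replacement note, and the `strip_flash` helper are all made up for illustration.

```python
import re

# The expression from this post, unchanged.
FLASH_TAG = re.compile(
    r"[<][^<]*application/x-shockwave-flash[^>]*[>][<]/[a-z]*[>]"
)

# Hypothetical replacement note; use whatever text you prefer.
NOTE = "[Flash content removed]"


def strip_flash(html):
    """Replace each matching Flash tag pair with the note."""
    return FLASH_TAG.sub(NOTE, html)


# Made-up feed item body containing an embedded Flash movie.
sample = (
    '<p>Intro</p>'
    '<embed src="http://example.com/movie.swf" '
    'type="application/x-shockwave-flash" width="400" height="300">'
    '</embed>'
    '<p>Outro</p>'
)

print(strip_flash(sample))
```

Note that the expression expects the closing tag to come immediately after the opening tag, and the closing tag name to be lowercase. An `<object>` element with nested `<param>` tags, or whitespace between the opening and closing tags, would not be matched in full.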

UPDATE: Wired recently published an article introducing Pipes.
