TODO: NLP POS tagger for English and foreign-language newspapers to detect proper names in the native language

http://nlp.stanford.edu/software/tagger.shtml
http://www.inkdrop.net/news/
http://www.onlinenewspapers.com/
https://en.wikipedia.org/wiki/List_of_newspapers_in_the_world_by_circulation

I am considering a Java version of NLP POS tagging to document the prevalence of proper names in high-circulation newspapers in various languages. Several of these languages can be processed by the available Stanford NLP software, and the results can be stored persistently in the original language. The Stanford POS tagger is available for Arabic, Chinese, French, Spanish, and German.
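As a rough illustration of the counting step, here is a minimal Java sketch. It assumes the text has already been tagged and that each token arrives in the `word_TAG` form, which is how the Stanford tagger prints tagged text by default. The class name `ProperNameCounter` and the method `countProperNames` are hypothetical, made up for this example. `NNP` and `NNPS` are the Penn Treebank proper-noun tags used for English; the models for other languages use their own tag sets, so the set of tags is passed in as a parameter and would need to be checked against each model.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class ProperNameCounter {

    // Penn Treebank proper-noun tags (English).
    // Assumption: other languages need a different set, taken from their model's tag set.
    public static final Set<String> ENGLISH_PROPER_TAGS =
            new HashSet<>(Arrays.asList("NNP", "NNPS"));

    // Counts how often each word carries one of the given proper-noun tags.
    // Input is whitespace-separated tokens of the form word_TAG.
    public static Map<String, Integer> countProperNames(String tagged, Set<String> properTags) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : tagged.trim().split("\\s+")) {
            // Split on the last underscore so words containing '_' stay intact.
            int sep = token.lastIndexOf('_');
            if (sep <= 0 || sep == token.length() - 1) {
                continue; // not a word_TAG token, skip it
            }
            String word = token.substring(0, sep);
            String tag = token.substring(sep + 1);
            if (properTags.contains(tag)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hand-written sample of tagged text, for illustration only.
        String tagged = "Angela_NNP Merkel_NNP visited_VBD Paris_NNP ._. Merkel_NNP spoke_VBD";
        System.out.println(countProperNames(tagged, ENGLISH_PROPER_TAGS));
    }
}
```

Because the words are kept exactly as they appear in the tagged text, the same counting works for non-Latin scripts without any translation step.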


2 responses to this post.

  1. I am interested in documenting information such as first occurrences and keyword collocation, while avoiding translation and maintaining data on circulated media in the original language.


  2. Performing a SWOT analysis of proper names appearing in Arabic newspapers, such as those in this listing: http://www.w3newspapers.com/arabic/ would be an interesting way to maintain awareness of current topics in Arab culture. The data could be kept in a relational DB with a table of proper names along with metadata such as article listings, keyword occurrences, and links to LONGBLOB columns or structured documents such as office spreadsheets or longer detailed documents.
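The kind of record described in these comments (a proper name, its first occurrence, and the articles it appears in) could look roughly like the following in-memory Java sketch. It only stands in for one row of the proposed table and is not a database schema. The names `ProperNameIndex`, `Entry`, and `record`, the field layout, and the `example.org` URLs are all hypothetical. Dates are assumed to be ISO `yyyy-MM-dd` strings, which compare correctly as plain text.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ProperNameIndex {

    // One entry per proper name; roughly one row of the proposed table.
    public static final class Entry {
        public final String name;                  // kept in the original script
        public String firstSeenDate;               // earliest ISO date seen so far
        public int occurrences;                    // total number of sightings
        public final List<String> articleUrls = new ArrayList<>();

        Entry(String name, String firstSeenDate) {
            this.name = name;
            this.firstSeenDate = firstSeenDate;
        }
    }

    private final Map<String, Entry> entries = new LinkedHashMap<>();

    // Records one sighting of a name in an article published on the given date.
    public void record(String name, String isoDate, String articleUrl) {
        Entry e = entries.get(name);
        if (e == null) {
            e = new Entry(name, isoDate);
            entries.put(name, e);
        } else if (isoDate.compareTo(e.firstSeenDate) < 0) {
            e.firstSeenDate = isoDate; // an earlier occurrence turned up
        }
        e.occurrences++;
        if (!e.articleUrls.contains(articleUrl)) {
            e.articleUrls.add(articleUrl);
        }
    }

    public Entry get(String name) {
        return entries.get(name);
    }

    public static void main(String[] args) {
        ProperNameIndex index = new ProperNameIndex();
        // "al-Qahira" (Cairo) in Arabic script, written with Unicode escapes.
        String cairo = "\u0627\u0644\u0642\u0627\u0647\u0631\u0629";
        index.record(cairo, "2015-03-02", "https://example.org/article-2");
        index.record(cairo, "2015-03-01", "https://example.org/article-1");
        Entry e = index.get(cairo);
        System.out.println(e.name + " first seen " + e.firstSeenDate
                + ", occurrences " + e.occurrences
                + ", articles " + e.articleUrls.size());
    }
}
```

In a relational DB the article list would more naturally be a separate table keyed by the name's id; the list field here is only a simplification for the sketch.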

