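Steps:
1. Prepare a UTF-8 encoded text file (file)
2. Read the file into a string (str)
3. Preprocess the text
4. Split the text and extract the words (list)
5. Count the words with a dictionary (set, dict)
6. Sort by frequency (list.sort(key=))
7. Exclude grammatical words that carry no meaning, such as pronouns, articles, and conjunctions
8. Output the top 20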
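The steps above can be sketched end to end as follows. This is a minimal sketch, not the code used later in this post: the names `STOP_WORDS`, `top_words`, and `top_words_from_file` are made up for illustration, the stop-word list is an assumed sample, and `collections.Counter` stands in for the hand-built dictionary.

```python
import re
from collections import Counter

# Illustrative stop words (an assumption, not the author's list).
STOP_WORDS = {"a", "an", "the", "and", "or", "but", "i", "you", "we", "it",
              "of", "to", "in", "is"}


def top_words(text, n=20, stop_words=STOP_WORDS):
    """Return the n most frequent non-stop words as (word, count) pairs."""
    text = text.lower()                                  # step 3: normalize case
    # Step 4: pull out runs of letters/apostrophes, then trim stray quotes.
    words = [w.strip("'") for w in re.findall(r"[a-z']+", text)]
    # Steps 5 and 7: count every word that is not a stop word.
    counts = Counter(w for w in words if w and w not in stop_words)
    return counts.most_common(n)                         # steps 6 and 8


def top_words_from_file(path, n=20):
    """Steps 1 and 2: read a UTF-8 text file, then rank its words."""
    with open(path, encoding="utf-8") as f:
        return top_words(f.read(), n)
```

Calling `top_words_from_file("lyrics.txt")` (a hypothetical file name) would return up to 20 `(word, count)` pairs, most frequent first.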
1. English song: word frequency count
Python
str2 = '''I'm undefeated
Jumpiing out of my skin, pull the chord
Yeah I believe it
The past, is everything we were
don't make us who we are
So I'll dream, until I make it real,
and all I see is stars
Its not until you fall that you fly
When your dreams come alive you're unstoppable
Take a shot, chase the sun, find the beautiful
We will glow in the dark turning dust to gold
And we'll dream it possible
possible
And we'll dream it possible
I will chase, I will reach, I will fly
Until I'm breaking, until I'm breaking
Out of my cage, like a bird in the night
I know I'm changing, I know I'm changing
In, into something big, better than before
And if it takes, takes a thousand lives
Then it's worth fighting for
Its not until you fall that you fly
When your dreams come alive you're unstoppable
Take a shot, chase the sun, find the beautiful
We will glow in the dark turning dust to gold
And we'll dream it possible
it possible
From the bottom to the top
We're sparking wild fire's
Never quit and never stop
The rest of our lives
From the bottom to the top
We're sparking wild fire's
Never quit and never stop
Its not until you fall that you fly
When your dreams come alive you're unstoppable
Take a shot, chase the sun, find the beautiful
We will glow in the dark turning dust to gold
And we'll dream it possible
possible
possible'''.lower()

# Preprocessing: two replacements, then trim and split into a word list
str2 = str2.replace(...)
str2 = str2.replace(...)
print(str2)
str2 = str2.strip()
str2 = str2.split()
print(str2)

# Count of every word occurrence
for word in str2:
    print(word, str2.count(word))

# Unique words minus the words to exclude
strSet = set(str2)
newSet = {...}  # pronouns, articles, conjunctions, and similar words
strSet1 = strSet - newSet
print(strSet1)

# Word-count dictionary
strdict = {}
for word in strSet1:
    strdict[word] = str2.count(word)
print(strdict)

# Sort by count, highest first
strList = list(strdict.items())

def takesecond(elem):
    return elem[1]

strList.sort(key=takesecond, reverse=True)
print(strList)

# Top 20
for i in range(20):
    print(strList[i])
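For a quick check of the set-and-count approach in this section, here is a small runnable sketch on a four-line excerpt of the lyrics. The `replace` arguments and the contents of `stop_words` are assumptions chosen for illustration, and the variable names differ from the ones used in the post.

```python
lyrics = """We will glow in the dark turning dust to gold
And we'll dream it possible
possible
And we'll dream it possible"""

# Preprocess: lowercase, turn commas into spaces (assumed), split on whitespace.
words = lyrics.lower().replace(",", " ").strip().split()

# Assumed stop words for this excerpt only.
stop_words = {"we", "and", "it", "in", "the", "to", "we'll"}

# Unique words minus stop words, then one count per remaining word.
candidates = set(words) - stop_words
freq = {word: words.count(word) for word in candidates}


def take_second(pair):
    """Sort key: the count in a (word, count) pair."""
    return pair[1]


ranked = sorted(freq.items(), key=take_second, reverse=True)

# Slicing avoids an IndexError when fewer than 20 words remain.
for word, count in ranked[:20]:
    print(word, count)
```

`words.count(word)` rescans the whole list for each unique word, which is fine for a song; for longer texts a single pass with `collections.Counter` or `dict.get(word, 0) + 1` is likely the better choice.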
2. Chinese novel: word frequency count
Python
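import jieba

# Read the novel (UTF-8 text file)
f = open('...', encoding='utf-8')
fo = f.read()
f.close()
print(fo)

# Segment into words (precise mode, returns a list) and count them
doupols = jieba.lcut(fo)
doupodict = {}
for word in doupols:
    doupodict[word] = doupodict.get(word, 0) + 1
print(doupodict)

# Other segmentation modes
jieba.cut(fo)                  # precise mode
jieba.cut(fo, cut_all=True)    # full mode
jieba.cut_for_search(fo)       # search-engine mode

# Sort by count, highest first, and show the top 20
wcList = list(doupodict.items())
wcList.sort(key=lambda x: x[1], reverse=True)
print(wcList)
for i in range(20):
    print(wcList[i])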
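A compact variant of the Chinese word count might look like the sketch below. It assumes the third-party `jieba` package is installed; `count_words`, `top_words_from_file`, and the `min_len` filter (which drops single-character tokens and punctuation) are illustrative choices, not something taken from the post.

```python
from collections import Counter


def count_words(tokens, min_len=2):
    """Count tokens, skipping punctuation, whitespace, and very short tokens."""
    counts = Counter()
    for token in tokens:
        token = token.strip()
        # str.isalnum() is True for CJK characters and False for punctuation.
        if len(token) >= min_len and token.isalnum():
            counts[token] += 1
    return counts


def top_words_from_file(path, n=20):
    """Read a UTF-8 text file, segment it with jieba, return the top n words."""
    import jieba  # third-party: pip install jieba

    with open(path, encoding="utf-8") as f:
        text = f.read()
    return count_words(jieba.lcut(text)).most_common(n)
```

Keeping the counting separate from the segmentation means `count_words` can be checked with a hand-made token list, and `jieba.lcut` can later be swapped for `jieba.lcut_for_search` or `jieba.lcut(text, cut_all=True)` without touching the counting code.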