Text compression and decompression with Huffman coding, implemented in Python

Size: 69KB

File type: .rar

Price: 2 coins

Downloads: 0

Published: 2021-06-03
Language: Python
Tags: huffman, compression/decompression, python


Resource description

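The code targets Python 2.7.9, so make sure you download the right thing. The archive includes a txt file that serves as the compression input; to compress and decompress other files, change the variable path1 in the script. The code is commented in detail. The implementation is simple, but it contains many ideas of my own and is my own intellectual property, so the price may be a bit high. Thanks, everyone!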
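
For readers who want a quick picture of what Huffman coding a text involves, here is a minimal sketch. It is not the author's script: it uses the standard-library `heapq` and `collections.Counter` instead of the sorted-dictionary merging shown in the preview, and the names `build_code_table`, `encode`, and `decode` are hypothetical helpers chosen for illustration.

```python
import heapq
from collections import Counter
from itertools import count


def build_code_table(text):
    """Return a dict mapping each character of `text` to a Huffman bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    tie = count()  # unique tie-breaker so the heap never compares tree nodes
    heap = [(n, next(tie), ch) for ch, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees into one node (left, right).
        n0, _, left = heapq.heappop(heap)
        n1, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (n0 + n1, next(tie), (left, right)))
    table = {}
    stack = [(heap[0][2], "")]
    while stack:
        node, prefix = stack.pop()
        if isinstance(node, tuple):      # internal node: descend both ways
            stack.append((node[0], prefix + "0"))
            stack.append((node[1], prefix + "1"))
        else:                            # leaf: a single character
            table[node] = prefix
    return table


def encode(text, table):
    """Concatenate the code of every character into one '0'/'1' string."""
    return "".join(table[ch] for ch in text)


def decode(bits, table):
    """Invert `encode`; this works because Huffman codes are prefix-free."""
    inverse = {code: ch for ch, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


table = build_code_table("abracadabra")
bits = encode("abracadabra", table)
restored = decode(bits, table)
```

The exact bit strings depend on how ties between equal frequencies are broken, so two correct implementations may produce different tables with the same total encoded length.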


Code snippet and file information

#coding:utf-8
#python version:2.7.9
#email: 545989326@qq.com
#goal: compress and decompress files with Huffman coding
import types
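def get_code(tree, char, code):          # recursively walk the tree to collect the Huffman code of one character
    if type(tree) == type('1'):          # reached a leaf that is not the target character
        return False
    if tree['0'] == char:
        code.append('0')
        return True
    elif tree['1'] == char:
        code.append('1')
        return True
    else:
        # bits are appended on the way back up, so `code` ends up in leaf-to-root order
        if get_code(tree['0'], char, code):
            code.append('0')
            return True
        if get_code(tree['1'], char, code):
            code.append('1')
            return True
    return False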
def get_huffman_code(string):
    char_set = {}
    for char in string:                  # count how often each character occurs
        if char not in char_set:
            char_set[char] = 1
        else:
            char_set[char] += 1
    result_d = {}
    for key in char_set:                 # swap keys and values so the entries can be sorted by frequency
        if char_set[key] not in result_d:
            result_d[char_set[key]] = key
        else:
            while char_set[key] in result_d:   # nudge equal frequencies apart so every key stays unique
                char_set[key] += 0.01
            result_d[char_set[key]] = key

    while len(result_d) != 1:             # repeatedly merge the two lightest entries
        char_set_tmp = sorted(result_d.iteritems(), key=lambda asd: asd[0], reverse=False)
        x = char_set_tmp[0][0] + char_set_tmp[1][0]
        d = {'0': char_set_tmp[0][1], '1': char_set_tmp[1][1]}
        result_d.pop(char_set_tmp[0][0])
        result_d.pop(char_set_tmp[1][0])
        if x not in result_d:          # keep the merged weight from colliding with an existing key
            result_d[x] = d
        else:
            while x in result_d:
                x += 0.01
            result_d[x] = d
    for key in result_d:               # unwrap the single remaining entry: the root of the tree
        result_d = result_d[key]
    result = {}
    for key in char_set:
        code = []
        get_code(result_d, key, code)
        # get_code collects bits from leaf to root, so reverse them
        result[key] = ''.join(reversed(code))
    return result
def get_new_file(string, result, path1):
    string_new = ''                                          # bit string
    for char in string:
        string_new += result[char]
    remain_num = (7 - len(string_new) % 7) % 7               # zero bits needed to reach a multiple of 7 (0 if already aligned)
    string_new = string_new + remain_num * '0'

    f = open(path1[0:-4] + '_new' + '.txt', 'wb')
    i = 0
    while i < len(string_new):
        num = chr(int(string_new[i:i+7], 2))                # each 7-bit group becomes the character with that ASCII code
        f.write(str(num))
        i += 7

    f.write(chr(remain_num))                                 # trailing byte records how many padding bits were added
    f.close()
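def get_past_file(string, result, path1):       # decompress the compressed file into a new file
    result_new = {}                             # swap keys and values so a code can be looked up directly
    for d in result:
        result_new[result[d]] = d
    b_string = ''                               # the restored bit string
    for i in string:
        s_tmp = str(bin(ord(i)))
        s_tmp = s_tmp[2:]
        s_tmp = '0' * (7 - len(s_tmp)) + s_tmp  # left-pad every byte back to 7 bits
        b_string += s_tmp

    k = b_string[-7:]                          # last 7 bits hold the padding count; they and the padding bits are to be stripped
    i = int(k, 2)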
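
The preview stops before the padding is actually stripped. As an illustration of the storage scheme the comments describe (7 payload bits per byte, zero padding, and a final byte holding the padding count), here is a small round-trip sketch. It is an assumption about how the scheme fits together, not the remainder of the author's file, and `pack_bits` and `unpack_bits` are hypothetical names.

```python
def pack_bits(bits):
    """Pack a '0'/'1' string into bytes, 7 bits per byte, plus one pad-count byte."""
    pad = (7 - len(bits) % 7) % 7          # 0 when the length is already a multiple of 7
    bits += "0" * pad
    body = bytearray(int(bits[i:i + 7], 2) for i in range(0, len(bits), 7))
    body.append(pad)                       # trailing byte: number of padding bits
    return bytes(body)


def unpack_bits(data):
    """Invert pack_bits: rebuild the bit string and drop the padding."""
    data = bytearray(data)
    if not data:
        return ""
    pad = data[-1]
    bits = "".join(format(b, "07b") for b in data[:-1])
    return bits[:len(bits) - pad]


packed = pack_bits("1011001110")
unpacked = unpack_bits(packed)
```

Because only 7 of every 8 bits carry payload, this layout wastes one bit per byte compared with packing 8 bits at a time; it does keep every stored value below 128.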

Attribute        Size     Date    Time   Name
----------- ---------  ---------- -----  ----
     File       4206  2017-11-24 09:19  huffman.py
     File     190066  2017-11-22 18:20  Aesop_Fables.txt
----------- ---------  ---------- -----  ----
               194272                    2
