- 微信
- 微博
  
  分享文章到微博
- 复制链接
  
  复制链接到剪贴板

HuffmanTree

tianchou 发表于 2023/11/14 19:28:09 2023/11/14

【摘要】 HuffmanTree的常用操作及例题

title: 【HuffmanTree】
date: 2023-09-28 20:15:27
tags: 数据结构

哈夫曼树

定义

通过==最小堆（最小堆存放树的根结点）==来实现，每次拿出两个权值==最小的二叉树==进行合并，合并后的新树插入最小堆

注意

哈夫曼树属于 ==树==，也是链式存储
构建HuffmanTree之前必须先构建MinHeap

特点

==权值的个数即为叶子节点的个数==

HuffmanTree编码的**码字均在叶结点**上

哈夫曼树是一种带权路径长度最短的树，在一个度为m的哈夫曼树中，其叶结点个数为n，则非叶结点的个数为（n-1) / （m-1）。
答：叶结点即度为0的结点有n个；假设度为m的结点个数为x,则x+n=mx+1;也就是x=n-1/m-1;
若n-1不能被整除，即所给数据不能直接构造最优m叉树，这时需要加一些不影响建树的数据，可以添0；添加的个数为(m-1)-（(n-1)%(m-1)）。所以最终x应该为⌈n-1/m-1⌉ ，即向上取整；

联想：信息论进行m元Huffman编码进行压缩，每m个压缩一次后减少了m-1个，最后一次压缩可能需要补零。q+t=k(m-1)+m

操作

对象

typedef struct node 
{
	int weight;
    struct node* left,right;
}*HuffmanTree;

typedef struct HeapNode 
{
	HuffmanTree data[Maxsize];		//Attention!!
    int size;
}* MinHeap;

	HuffmanTree T=CreateHuffman();

哈夫曼树的创建(初始化)

HuffmanTree CreateHuffman()
{
	Huffman T=(Huffman)malloc(sizeof(struct TreeNode));
	T->left=T->right=NULL;
	T->weight=0;
	return T;
}

哈夫曼树的建立

HuffmanTree BuildHuffman(MinHeap H)
{
    HuffmanTree T;
    for(int i=1;i<H->size;i++)		/*做H->Size-1次合并*/
    {
        T=malloc(sizeof(struct node));		/*建立新结点*/
        T->left=Delete(H);			/*从最小堆中删除一个结点，作为新T的左子结点*/
        T->right=Delete(H);			/*再从最小堆中删除一个结点，作为新T的右子结点*/
        T->weight=T->left->weight+T->right->weight;		/*计算新权值*/
        Insert(H,T);		/*将新T插入最小堆*/
    }
    T=Delete(H);
    return T;
}

需要调用的MinHeap函数

最小堆CreatHeap

MinHeap CreatHeap()
{
    MinHeap H=(MinHeap)malloc(sizeof(struct HeapNode));
	H->data[0]=(Huffman)malloc(sizeof(struct TreeNode));
	H->data[0]->left=H->data[0]->right=NULL;
	H->data[0]->weight=-1;	//哨兵H->data[0]的值最小
	H->size=0;
    return H;
}

最小堆BuildHeap函数

void BuidHeap (MinHeap H)
{    
    HuffmanTree t=(HuffmanTree)malloc(sizeof(struct node));
    t->left=t->right=NULL;
    for(int i=0;i<n;i++)
    {
        cin>>t->weight;
        Insert(H,t);
    }
    return H;
}

最小堆Delete函数

需要把ElementType改变成HuffmanTree

将H->data[child] 改为H->data[child]->weight

将t改为t->weight

HuffmanTree Delete(MinHeap H)	//Attention!!!
{
//可有可无
//    if(H->size==0)
//        return;
    
    HuffmanTree min=H->data[1];	//取出根节点最小值，最后return
   /* 用最小堆中最后一个元素从根结点开始向上过滤下层结点 */ 
    HuffmanTree t=H->data[H->size--];
    int parent,child;
    for(parent=1;parent*2<=H->size;parent=child)	//若parent*2>H->size说明parent没有左儿子，也就更没有右儿子
    {
        child=parent*2;		//child指向左右儿子最小的那个，先初始赋值左儿子
        if(child!=H->size&&H->data[child]->weight > H->data[child+1]->weight)	//child!=H->size说明有右儿子
            child++;	
        if(t->weight <= H->data[child]->weight)	break;
        else
            H->data[parent]=H->data[child];
    }
    H->data[parent]=t;
    return min;
}

将H->Elements[]按==权值==H->Elements[]->weight调整为最小堆

第11行~~parent*=2~~错误，必须是parent=child,作用是parent索引变成儿子索引，向下交换

第9行~~data[H->size]~~错误，必须是data[H->size--]

最小堆Insert函数

需要把ElementType改变成HuffmanTree

将H->data[i/2] 改为H->data[i/2]->weight

将t改为t->weight

void Insert(MinHeap H,HuffmanTree t)
{
    int i;
    if(H->size>=Maxsize)
        return NULL;
    for(i=++H->size;t->weight < H->data[i/2]->weight;i/=2)
        H->data[i]=H->data[i/2];
    H->data[i]=t;
}

WPL的计算

int WPL(HuffmanTree T,int depth)
{
//注意：哈夫曼树没有度为1的节点 
	if(T->left==NULL&&T->right==NULL)	//左右子树都为空
		return depth*T->weight;
	else//递归去左右子树求权重，而且深度加1
		return WPL(T->left,depth+1)+WPL(T->right,depth+1);
}

初次调用WPL时：int sum=WPL(T,0);使depth的初始值为==0==；（因为==根节点的深度为0==）。这里的深度depth实际是码字长度

递归出口为根节点T->left==NULL&&T->right==NULL

递归关系为WPL(T->left,depth+1)+WPL(T->right,depth+1);

哈夫曼树没有度为1的节点

例题 05-树9 Huffman Codes

In 1953, David A. Huffman published his paper “A Method for the Construction of Minimum-Redundancy Codes”, and hence printed his name in the history of computer science. As a professor who gives the final exam problem on Huffman codes, I am encountering a big problem: the Huffman codes are NOT unique. For example, given a string “aaaxuaxz”, we can observe that the frequencies of the characters ‘a’, ‘x’, ‘u’ and ‘z’ are 4, 2, 1 and 1, respectively. We may either encode the symbols as {‘a’=0, ‘x’=10, ‘u’=110, ‘z’=111}, or in another way as {‘a’=1, ‘x’=01, ‘u’=001, ‘z’=000}, both compress the string into 14 bits. Another set of code can be given as {‘a’=0, ‘x’=11, ‘u’=100, ‘z’=101}, but {‘a’=0, ‘x’=01, ‘u’=011, ‘z’=001} is NOT correct since “aaaxuaxz” and “aazuaxax” can both be decoded from the code 00001011001001. The students are submitting all kinds of codes, and I need a computer program to help me determine which ones are correct and which ones are not.

Input Specification:

Each input file contains one test case. For each case, the first line gives an integer N (2≤N≤63), then followed by a line that contains all the N distinct characters and their frequencies in the following format:

c[1] f[1] c[2] f[2] ... c[N] f[N]

where c[i] is a character chosen from {‘0’ - ‘9’, ‘a’ - ‘z’, ‘A’ - ‘Z’, ‘_’}, and f[i] is the frequency of c[i] and is an integer no more than 1000. The next line gives a positive integer M (≤1000), then followed by M student submissions. Each student submission consists of N lines, each in the format:

c[i] code[i]

where c[i] is the i-th character and code[i] is an non-empty string of no more than 63 '0’s and '1’s.

Output Specification:

For each test case, print in each line either “Yes” if the student’s submission is correct, or “No” if not.

Note: The optimal solution is not necessarily generated by Huffman algorithm. Any prefix code with code length being optimal is considered correct.

Sample Input:

7
A 1 B 1 C 1 D 3 E 3 F 6 G 6
4
A 00000
B 00001
C 0001
D 001
E 01
F 10
G 11
A 01010
B 01011
C 0100
D 011
E 10
F 11
G 00
A 000
B 001
C 010
D 011
E 100
F 101
G 110
A 00000
B 00001
C 0001
D 001
E 00
F 10
G 11

Sample Output:

Yes
Yes
No
No

代码长度限制

16 KB

时间限制

400 ms

内存限制

64 MB

[解题思路](数据结构_中国大学MOOC(慕课) (icourse163.org))

分析

代码

#include <iostream>
#include <unordered_map>
#define Maxsize 64
using namespace std;

//构建MinHeap和HeapNode结构体
typedef struct node 
{
	int weight;
    struct node* left, *right;
}*HuffmanTree;
typedef struct HeapNode 
{
	HuffmanTree data[Maxsize];		//Attention!!
    int size;
}* MinHeap;

//定义全局变量
int n,m,min_length;	char c; 	
unordered_map<char,int>cnt;

//函数声明
MinHeap CreatHeap();
HuffmanTree CreateHuffman();
void Insert(MinHeap H,HuffmanTree t);
HuffmanTree Delete(MinHeap H);
int WPL(HuffmanTree T,int depth);
bool judge();
 
int main()
{
	cin>>n;
    
//	建立最小堆 
	MinHeap H=CreatHeap();
    for(int i=0;i<n;i++)
    {
    	HuffmanTree t=CreateHuffman(); 
        cin>>c>>t->weight;
        cnt[c]=t->weight;
        Insert(H,t);
    }
    
//  建立哈夫曼树 
    HuffmanTree T;
    int n=H->size; 	
//  for(int i=1;i<H->size;i++)		 错误，H->size的值会在循环中改变 
    for(int i=1;i<n;i++)			/*做n-1次合并*/
    {
        T=(HuffmanTree)malloc(sizeof(struct node));		
        T->left=Delete(H);			
        T->right=Delete(H);			
        T->weight=T->left->weight+T->right->weight;		/*计算新权值*/
        Insert(H,T);		/*将新T插入最小堆*/
    }
    T=Delete(H);
    
//  计算最短长度 
	min_length=WPL(T,0);
//	判断 
	cin>>m;
	for(int i=0;i<m;i++)	
	{
		if(judge())		printf("Yes\n");
		else			printf("No\n");
	}
	return 0;
}

MinHeap CreatHeap()
{
    MinHeap H=(MinHeap)malloc(sizeof(struct HeapNode));
	H->data[0]=(HuffmanTree)malloc(sizeof(struct node));
	H->data[0]->left=H->data[0]->right=NULL;
	H->data[0]->weight=-1;			//哨兵H->data[0]的值最小
	H->size=0;
    return H;
}

HuffmanTree CreateHuffman()
{
	HuffmanTree T=(HuffmanTree)malloc(sizeof(struct node));
	T->left=T->right=NULL;
	T->weight=0;	//Attention！！
	return T;
}

void Insert(MinHeap H,HuffmanTree t)
{
//    可有可无 
//    if(H->size>=Maxsize)
//        return;
	int i;
    for(i=++H->size;t->weight < H->data[i/2]->weight;i/=2)
        H->data[i]=H->data[i/2];
    H->data[i]=t;
}

HuffmanTree Delete(MinHeap H)	
{
    if(H->size==0)
        return NULL;
    HuffmanTree min=H->data[1];		//取出根节点（weight最小），最后return
    
   /* 用最小堆中最后一个元素从根结点开始向上过滤下层结点 */ 
    HuffmanTree t=H->data[H->size--];
    int parent,child;
    for(parent=1;parent*2<=H->size;parent=child)	//若parent*2>H->size说明parent没有左儿子，也就更没有右儿子
    {
        child=parent*2;		//child指向左右儿子最小的那个，先初始赋值左儿子
        if(child!=H->size&&H->data[child]->weight > H->data[child+1]->weight)	//child!=H->size说明有右儿子
            child++;	
        if(t->weight<=H->data[child]->weight)	break;
        else
            H->data[parent]=H->data[child];
    }
    H->data[parent]=t;
    return min;
}

int WPL(HuffmanTree T,int depth)
{
//注意：哈夫曼树没有度为1的节点 
	if((T->left==NULL)&&(T->right==NULL))//左右子树都为空
		return depth*T->weight;
	else	//递归去左右子树求权重，而且深度加1
		return WPL(T->left,depth+1)+WPL(T->right,depth+1);
}

bool judge()
{
	int len=0;
    bool flag=1;
	string codes; 
	HuffmanTree T=CreateHuffman();      //模拟建树
	for(int i=0;i<n;i++)
	{
		cin>>c>>codes;
		if(codes.length()>=n)		//也可以没有这个判断
			return 0;
		else
		{
			HuffmanTree p = T;
			for(int j=0;j<codes.length();j++)
			{
				if(codes[j]=='0') 
				{
					if(!p->left)
						p->left = CreateHuffman();
					p = p->left;
					
				}
				else if(codes[j] == '1') 
				{
					if(!p->right)
						p->right = CreateHuffman();
					p = p->right;
				}
				if(p->weight) 	//说明已经被访问过了 
					flag = 0;	//不能直接打印，要把后面的读完 
			}
			if(p->left || p->right )		//说明该结点不是叶子节点 
				flag = 0;
			else
				p->weight = 1;
		}
		len += codes.length()*cnt[c];
	}
	if(len!=min_length)
		flag = 0;
	return flag;
}

注意

judge函数模拟建树过程中

左右移动过程中不能遇到已经访问的节点

最后赋值的节点必须是叶节点

节点weight起到标记flag的作用

【声明】本内容来自华为云开发者社区博主，不代表华为云及华为云开发者社区的观点和立场。转载时必须标注文章的来源（华为云社区）、文章链接、文章作者等基本信息，否则作者和本社区有权追究责任。如果您发现本社区中有涉嫌抄袭的内容，欢迎发送邮件进行举报，并提供相关证据，一经查实，本社区将立刻删除涉嫌侵权内容，举报邮箱： cloudbbs@huaweicloud.com

点赞
收藏
关注作者

0/1000

抱歉，系统识别当前为高风险访问，暂不支持该操作

全部回复

上滑加载中

设置昵称

在此一键设置昵称，即可参与社区互动！

*长度不超过10个汉字或20个英文字符，设置后3个月内不可修改。

确认取消

加入云驻计划，成为创作者

华为云周边好礼
免费体验产品
特殊身份标识
线下官方门票
内部专家零距离
与10000+优质创作者共同成长

立即加入

HuffmanTree

title: 【HuffmanTree】
date: 2023-09-28 20:15:27
tags: 数据结构

哈夫曼树

定义

注意

特点

操作

对象

哈夫曼树的创建(初始化)

哈夫曼树的建立

需要调用的MinHeap函数

最小堆CreatHeap

最小堆BuildHeap函数

最小堆Delete函数

最小堆Insert函数

WPL的计算

例题 05-树9 Huffman Codes

Input Specification:

Output Specification:

Sample Input:

Sample Output:

分析

代码

注意

全部回复

设置昵称

关于作者

目录

加入云驻计划，成为创作者

HuffmanTree

title: 【HuffmanTree】 date: 2023-09-28 20:15:27 tags: 数据结构

哈夫曼树

定义

注意

特点

操作

对象

哈夫曼树的创建(初始化)

哈夫曼树的建立

需要调用的MinHeap函数

最小堆CreatHeap

最小堆BuildHeap函数

最小堆Delete函数

最小堆Insert函数

WPL的计算

例题 05-树9 Huffman Codes

Input Specification:

Output Specification:

Sample Input:

Sample Output:

分析

代码

注意

全部回复

设置昵称

关于作者

目录

热门推荐查看更多

相关文章

加入云驻计划，成为创作者

相关产品

title: 【HuffmanTree】
date: 2023-09-28 20:15:27
tags: 数据结构