Loading... ` 考察对Huffman编码的理解,程序可能略繁,量力而为。` In 1953, David A. Huffman published his paper "A Method for the Construction of Minimum-Redundancy Codes", and hence printed his name in the history of computer science. As a professor who gives the final exam problem on Huffman codes, I am encountering a big problem: the Huffman codes are NOT unique. For example, given a string "aaaxuaxz", we can observe that the frequencies of the characters 'a', 'x', 'u' and 'z' are 4, 2, 1 and 1, respectively. We may either encode the symbols as {'a'=0, 'x'=10, 'u'=110, 'z'=111}, or in another way as {'a'=1, 'x'=01, 'u'=001, 'z'=000}, both compress the string into 14 bits. Another set of code can be given as {'a'=0, 'x'=11, 'u'=100, 'z'=101}, but {'a'=0, 'x'=01, 'u'=011, 'z'=001} is NOT correct since "aaaxuaxz" and "aazuaxax" can both be decoded from the code 00001011001001. The students are submitting all kinds of codes, and I need a computer program to help me determine which ones are correct and which ones are not. ### Input Specification: Each input file contains one test case. For each case, the first line gives an integer **N** (**2****≤****N****≤****63**), then followed by a line that contains all the **N** distinct characters and their frequencies in the following format: ``` c[1] f[1] c[2] f[2] ... c[N] f[N] ``` where `c[i]` is a character chosen from {'0' - '9', 'a' - 'z', 'A' - 'Z', '_'}, and `f[i]` is the frequency of `c[i]` and is an integer no more than 1000. The next line gives a positive integer **M** (**≤****1000**), then followed by **M** student submissions. Each student submission consists of **N** lines, each in the format: ``` c[i] code[i] ``` where `c[i]` is the `i`-th character and `code[i]` is an non-empty string of no more than 63 '0's and '1's. ### Output Specification: For each test case, print in each line either "Yes" if the student's submission is correct, or "No" if not. Note: The optimal solution is not necessarily generated by Huffman algorithm. Any prefix code with code length being optimal is considered correct. ### Sample Input: ```in 7 A 1 B 1 C 1 D 3 E 3 F 6 G 6 4 A 00000 B 00001 C 0001 D 001 E 01 F 10 G 11 A 01010 B 01011 C 0100 D 011 E 10 F 11 G 00 A 000 B 001 C 010 D 011 E 100 F 101 G 110 A 00000 B 00001 C 0001 D 001 E 00 F 10 G 11 ``` ### Sample Output: ```out Yes Yes No No ``` ## 题解 1. 利用堆实现哈夫曼树,计算WPL与测试数据是否相同,若相同且2 2. 判断测试数据是否符合前缀编码(**前缀编码** 是指对字符集进行**编码** 时,要求字符集中任一字符的**编码** 都不是其它字符的**编码** 的**前缀**),如果符合,则Yes,否则No WPL的计算: (以下三者等价) 1. 树的带权路径长度等于叶子节点的带权路径长度之和 2. 非叶子结点的权值之和 3. 除根节点以外所有节点的权值之和 ``` #include <iostream> #include <cstring> using namespace std; const int N = 70; int n, m; int q[N]; char s[N][N]; class MinHeap{ public: MinHeap(int capac){ pRoot = new int[capac + 1]; size = 0; capacity = capac; pRoot[0] = -1; } bool isEmpty(){ return size == 0 ? true : false; } bool isFull(){ return size >= capacity ? true : false; } void insert(int ele){ if(isFull()) return; int i = ++size; while(ele < pRoot[i/2]){ pRoot[i] = pRoot[i/2]; i /= 2; } pRoot[i] = ele; } int pop(){ if(isEmpty()) return -1; int res = pRoot[1]; int lastEle = pRoot[size--]; int pParent = 1; int pChild = pParent*2; while(pChild <= size){ if(pChild != size && pRoot[pChild] > pRoot[pChild + 1]) pChild++; if(lastEle > pRoot[pChild]){ pRoot[pParent] = pRoot[pChild]; }else{ break; } pParent = pChild; pChild = pParent*2; } pRoot[pParent] = lastEle; return res; } void buildHeap(){ for(int i = size/2; i > 0; i--){ bmh(i); } } void add(int& ele){ if(isFull()) return; pRoot[++size] = ele; } void Print(){ for (int i = 1; i <= size; i++){ cout << pRoot[i] << " "; } cout << endl; } int HuffMan(){ int wpl = 0; int allSize = size; for(int i = 0; i < allSize - 1; i++){ int tmp1 = pop(); int tmp2 = pop(); wpl += tmp1 + tmp2; insert(tmp1 + tmp2); } return wpl; } private: void bmh(int pTmpRoot){ int pParent = pTmpRoot; int pChild = pParent*2; int tmp = pRoot[pParent]; while(pChild <= size){ if(pChild != size && pRoot[pChild] > pRoot[pChild + 1]) pChild++; if(tmp <= pRoot[pChild]) break; else pRoot[pParent] = pRoot[pChild]; pParent = pChild; pChild = pParent*2; } pRoot[pParent] = tmp; } private: int size; int capacity; int *pRoot; }; //判断前缀 bool isPre(char* s1, char* s2){ while(s1 && s2 && *s1 == *s2){ s1++; s2++; } if(!*s1 || !*s2) return true; else return false; } bool isPreCode(char s[][N], int n){ for(int i = 0; i < n; i++){ for(int j = i + 1; j < n; j++){ if(isPre(s[i], s[j])){ return true; } } } return false; } int main(){ char cTmp; int inTmp; int wpl; int num; cin >> n; MinHeap* pHeap = new MinHeap(N); for(int i = 0; i < n; i++){ cin >> cTmp; cin >> inTmp; pHeap->add(inTmp); q[i] = inTmp; } pHeap->buildHeap(); wpl = pHeap->HuffMan(); //cout << wpl << endl; cin >> m; //cout << "m is " << m << endl; for(int i = 0; i < m; i++){ num = 0; for(int j = 0; j < n; j++){ cin >> cTmp; cin >> s[j]; num += strlen(s[j])*q[j]; } //cout << "num is " << num << endl; if(num == wpl && !isPreCode(s, n)){ cout << "Yes" << endl; }else{ cout << "No" << endl; } //cout << num << endl; } return 0; } ``` 最后修改:2022 年 04 月 05 日 © 禁止转载 打赏 赞赏作者 支付宝微信 赞 0 如果觉得我的文章对你有用,请随意赞赏