<P>I modified the program a bit, mainly its termination conditions:<BR>function tree=m_tree(features,targets,Nbins)</P>
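<P>[Ni,L]=size(features); % size of the non-class attribute training set: Ni samples by L attributes<BR>Uc=unique(targets); % distinct class labels in the target set, e.g. classes 1 2 3<BR>U=length(Uc); % number of distinct class labels, e.g. 3</P>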
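<P>% Termination conditions<BR>if (U==1), % all samples have the same class<BR> tree.split_dim=0;<BR> tree.child=Uc(1); % leaf: store the single class label; the parent's tree.child(a) already records which branch this leaf hangs from, so no index a is needed here<BR> return<BR>end</P>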
<P>if (L == 1), % only one attribute column left: stop and label each bin<BR> tree.split_dim = 0;<BR> for a = 1:Nbins,<BR> indices = find(features == a);<BR> if ~isempty(indices),<BR> if (length(unique(targets(indices))) == 1),<BR> tree.child(a) = targets(indices(1)); % pure bin: take its class<BR> else<BR> H = hist(targets(indices), Uc); % count samples per class<BR> [m, T] = max(H);<BR> tree.child(a) = Uc(T); % mixed bin: take the majority class<BR> end<BR> else<BR> tree.child(a) = inf; % no sample fell into this bin<BR> end<BR> end<BR> return<BR>end</P>
<P>% Start the computation<BR>for a=1:U,<BR> Pnode(a)=length(find(targets==Uc(a)))/Ni; % compare against the actual label Uc(a): inside a recursive call the labels need not be 1..U<BR>end<BR>Inode=-sum(Pnode.*log(Pnode)/log(2)); % entropy of the node, H(U)<BR><BR>delta_Ib=zeros(1,L);<BR>for a=1:L,<BR> for k=1:Nbins,<BR> f=find(features(:,a)==k);<BR> V(k)=length(f);<BR> Y(k)=V(k)/Ni;<BR> P=zeros(1,U); % reset for every bin so an empty bin does not reuse stale values<BR> if (V(k)~=0),<BR> for b=1:U,<BR> s=length(find(targets(f)==Uc(b)));<BR> P(b)=s/V(k); % P(class b | bin k)<BR> end<BR> end<BR> Q=sum(-P.*log(eps+P)/log(2));<BR> E(k)=Y(k).*Q;<BR> end<BR> info=sum(E);<BR> delta_Ib(a)=Inode-info; % information gain H(U)-H(U|V)<BR>end<BR>[m,dim]=max(delta_Ib);<BR>tree.split_dim=dim; % column index within the current (already reduced) feature matrix<BR>dims=find(~ismember(1:L,dim)); % remaining attributes<BR>for a=1:Nbins,<BR> indices=find(features(:,dim)==a); % samples with the same value of the chosen attribute form one subset; one subset per value<BR> if (~isempty(indices)),<BR> tree.child(a)=m_tree(features(indices,dims),targets(indices),Nbins); % recursively call the tree-building algorithm on each non-empty subset<BR> end<BR>end</P>
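<P>To check the gain computation by hand, here is a minimal sketch of the same quantity H(U)-H(U|V) for a single attribute. The helper names entropy_bits and info_gain and the toy data are my own assumptions for illustration, not part of m_tree, and the script form with local functions assumes MATLAB R2016b or later.</P>

```matlab
% Toy data, made up for illustration: 6 samples, one attribute with bins 1..2.
feature = [1 1 1 2 2 2]';
targets = [1 1 2 2 2 2]';
gain = info_gain(feature, targets);
disp(gain)

function h = entropy_bits(labels)
% Shannon entropy in bits of a label vector; only labels that occur are counted,
% so no probability is ever zero and log2 stays finite.
u = unique(labels);
p = arrayfun(@(c) mean(labels == c), u);
h = -sum(p .* log2(p));
end

function g = info_gain(feature, targets)
% H(targets) minus the weighted entropy of the subsets sharing a feature value.
g = entropy_bits(targets);
vals = unique(feature);
for k = 1:numel(vals)
    idx = (feature == vals(k));
    g = g - mean(idx) * entropy_bits(targets(idx));
end
end
```

<P>Comparing this value with delta_Ib(a) for the same column should show quickly whether the loop over bins and classes is counting the right samples.</P>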
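<P>Execution result:<BR> split_dim=1<BR> tree:[1*3 struct]<BR>None of the intermediate values are shown, and the recursion does not seem to take effect.<BR><BR>For the example I am using, the final result should have this form:<BR> split_dim:1<BR> / | \<BR> split_dim:2 1 split_dim:<BR> / \ / \<BR> 1 2 1 2<BR><BR>Maybe the termination conditions are still wrong? How can I get the recursion to work?</P>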
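<P>A note on the display: [1*3 struct] is how MATLAB summarizes a nested struct array, so that line alone does not show whether the children were built. One way to look inside is to walk the struct recursively. The sketch below uses a hypothetical helper show_tree and a hand-built tree in the shape I expect m_tree to return (split_dim plus child); the values are illustrative only, and the script form with a local function assumes MATLAB R2016b or later.</P>

```matlab
% Hand-built example tree; the numbers are illustrative, not real results.
leafA = struct('split_dim', 0, 'child', [1 2]);  % last-attribute leaf: one label per bin
leafB = struct('split_dim', 0, 'child', 1);      % pure leaf: a single class label
root.split_dim = 1;
root.child(1) = leafA;
root.child(3) = leafB;                           % child(2) stays empty: no samples in bin 2
nLeaves = show_tree(root, 0);

function n = show_tree(tree, depth)
% Print the tree with indentation; returns the number of leaf nodes visited.
pad = repmat('  ', 1, depth);
if tree.split_dim == 0
    disp([pad 'leaf: ' mat2str(tree.child)]);
    n = 1;
    return
end
disp([pad 'split_dim: ' num2str(tree.split_dim)]);
n = 0;
for a = 1:numel(tree.child)
    if isempty(tree.child(a).split_dim)          % bin that received no samples
        continue
    end
    disp([pad ' bin ' num2str(a) ':']);
    n = n + show_tree(tree.child(a), depth + 1);
end
end
```

<P>Calling show_tree(tree, 0) on the output of m_tree should print one line per node if the recursion produced subtrees; typing tree.child(1) at the prompt is a quicker spot check.</P>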