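My Perl script is below; please go ahead and "multitask" it. Perl's multi-threading is experimental, so I don't dare use it. Using fork() to spawn several processes is not only clumsy, but how would those processes update the same variable together?

Would I have to use RPC? Shared memory?

(This post was published on the rolia.net forum.)

In Java, starting 1000 threads at once really was much faster than starting 80 (though it is probably just grabbing more time slices) :)
With Perl multi-process (fork()), should I start 80 processes? :-)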

#!/usr/bin/perl
# Program name: from.pl => Check From: in header files,
# then build a hash to see which sender domain occurs most often.
# Log lines are read from STDIN or from the files named on the command line.

use strict;
use warnings;

$| = 1;    # unbuffered output

# Check the environment before calling imconfget, which depends on it.
if ( !defined $ENV{INTERMAIL} ) {
    print "Proper environment not loaded. Please run me as 'imail'.\n";
    exit 1;
}

my $imdata = `imconfget -m mss messageFilesDir`;
chomp $imdata;

my $max = 10000;    # stop after this many message files
my %links;          # sender domain => number of messages

# Scan the log for MsMsgCreated lines and process each message file they name.
sub dodir {
    my $nLines = 0;
    while ( my $line = <> ) {
        # Use the capture only when the match succeeded; otherwise $1 is stale.
        next unless $line =~ /MsMsgCreated.+?\s+(.+?):/;
        doaction($1);
        last if ++$nLines >= $max;
    }
    return $nLines;
}

# Count the domain of the first From: line in one message file.
sub doaction {
    my ($file) = @_;
    open( my $fh, '<', "$imdata/$file" ) or return 0;
    while ( my $line = <$fh> ) {
        chomp $line;
        $line =~ s/#.*//;     # no comments
        $line =~ s/^\s+//;    # no leading whitespace
        $line =~ s/\s+$//;    # no trailing whitespace
        next unless length $line;
        if ( $line =~ /^From:.*@([\w.]+)/ ) {
            $links{$1}++;
            last;
        }
    }
    close $fh;
    return 1;
}

dodir();

# Print domains seen more than 10 times, most frequent first.
foreach my $domain ( sort { $links{$b} <=> $links{$a} } keys %links ) {
    next unless $links{$domain} > 10;
    print "$links{$domain} $domain\n";
}
exit;
__END__
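
One common answer to the "how do several workers update the same variable" question is not to share the variable at all: each worker counts its own slice into a private map, and the parent adds the partial counts together at the end. The sketch below shows that shape in Java (8 or later). The class name `PartialCounts` and the methods `countSlice` and `countAll` are made up for illustration, and a plain list of domain strings stands in for the message files. The same split-then-merge shape could in principle carry over to fork()ed Perl children that each report their partial counts to the parent, but that variant is not shown or tested here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: names and input format are assumptions.
class PartialCounts {
    /** Counts one slice of the input into a map private to the calling worker. */
    static Map<String, Integer> countSlice(List<String> domains) {
        Map<String, Integer> local = new HashMap<>();
        for (String domain : domains) {
            local.merge(domain, 1, Integer::sum);
        }
        return local;
    }

    /** Splits the input into chunks, counts each chunk in a worker, then merges. */
    static Map<String, Integer> countAll(List<String> domains, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int chunk = Math.max(1, (domains.size() + workers - 1) / workers);
            List<Future<Map<String, Integer>>> parts = new ArrayList<>();
            for (int start = 0; start < domains.size(); start += chunk) {
                List<String> slice =
                    domains.subList(start, Math.min(start + chunk, domains.size()));
                Callable<Map<String, Integer>> task = () -> countSlice(slice);
                parts.add(pool.submit(task));
            }
            // Only this thread touches 'total', so no locking is needed.
            Map<String, Integer> total = new HashMap<>();
            for (Future<Map<String, Integer>> part : parts) {
                part.get().forEach((k, v) -> total.merge(k, v, Integer::sum));
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> domains = java.util.Arrays.asList(
            "a.com", "b.com", "a.com", "c.com", "a.com", "b.com");
        System.out.println(countAll(domains, 3));
    }
}
```

Because nothing is shared while the workers run, there is no lock contention; the cost is one extra merge pass over the partial maps.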

Replies, comments and Discussions:

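  • Java experts and gurus: I have a rather basic question and would appreciate any help
    I had a small Perl script
    that mainly matches a key from log files
    and then does $ahash{$key}++;
    so I know which key occurs most often.
    All it takes is sorting the hash by value and printing it out.
    Very easy: sorting by value is a single statement.

    I thought Java multi-threading might be faster,
    so I tried rewriting it. I found that Java has many data structures with similar functionality,
    like:
    HashMap, HashSet, SortedMap, TreeSet, TreeMap
    so I used a TreeMap:
    TreeMap tm = new TreeMap ( new MyComparator () );
    It works nicely,
    but it sorts by key,
    and I want to sort by value.

    I searched for a long time and found nothing. Neither Arrays.sort() nor Collections.sort() seemed applicable.
    Searching Google gave no answer either.
    In the end I had to use an extra SortedSet.
    Surely Java isn't this clumsy?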
    • Why don't you create another TreeMap(value, key) from your old TreeMap(key, value)?
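      • Wouldn't that store everything a second time? I only want to sort, not waste a lot of space copying the data. I'll post the program in a moment; please give me some pointers.
        • Did you solve your problem?
          • I solved it with a TreeSet. The speed is fine, and I assume Java doesn't allocate extra space to copy the entries. Efficiency is acceptable.
            The thing is, sorting a hash by value takes a single line in Perl.
            Now I have to write all this?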

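            // Orders entries by count, highest first. Ties fall back to the key,
            // because a TreeSet drops any element whose compare() result is 0,
            // which would silently discard keys that share a count.
            // Assumes the keys are Comparable (for example String).
            class ComparatorByValue implements Comparator {
                public int compare(Object o1, Object o2) {
                    Map.Entry e1 = (Map.Entry) o1;
                    Map.Entry e2 = (Map.Entry) o2;
                    int byValue = ((Integer) e2.getValue()).compareTo((Integer) e1.getValue());
                    if (byValue != 0) {
                        return byValue;
                    }
                    return ((Comparable) e1.getKey()).compareTo(e2.getKey());
                }
                // No equals() override: the earlier version called compare(this, o),
                // which casts the comparator itself to Map.Entry and throws
                // ClassCastException.
            }

            TreeSet keySet = new TreeSet(new ComparatorByValue());
            keySet.addAll(tm.entrySet());
            Iterator iterator = keySet.iterator();
            while (iterator.hasNext()) {
                Map.Entry entry = (Map.Entry) iterator.next();
                Integer myVal = (Integer) entry.getValue();
                if (myVal.intValue() > 10) {
                    System.out.println(entry.getKey() + "/" + entry.getValue());
                }
            }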
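
For comparison with Perl's one-line sort by value, here is a minimal sketch of the same idea: copy the map's entries into a list and sort that list by value. It assumes Java 8 or later (generics and lambdas, which the Java 1.4.2 used in this thread did not have). The class name `SortByValue` and the method `byValueDescending` are made up for illustration. Copying the entry set copies references to the entries only, not the keys or values themselves.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: names are assumptions, not from the thread.
class SortByValue {
    /** Returns the entries ordered by count (highest first), ties broken by key. */
    static List<Map.Entry<String, Integer>> byValueDescending(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byValue = b.getValue().compareTo(a.getValue());
            return byValue != 0 ? byValue : a.getKey().compareTo(b.getKey());
        });
        return entries;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("a.com", 12);
        counts.put("b.com", 30);
        counts.put("c.com", 12);
        for (Map.Entry<String, Integer> entry : byValueDescending(counts)) {
            System.out.println(entry.getKey() + "/" + entry.getValue());
        }
    }
}
```

A list keeps entries with equal counts, whereas a sorted set needs a tie-breaker to avoid treating them as duplicates.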
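    • I changed the program and found that Java multi-threading is really impressive! The original Perl took 5-10 minutes; the Java version now needs only about 10 seconds to parse 10000 files.
      Red Hat Linux 9 and Java 2 Platform, Standard Edition 1.4.2: A Winning Combination
      http://developer.java.sun.com/developer/technicalArticles/JavaTechandLinux/RedHat/

      I'm running it on a 4-CPU Solaris machine,
      so the speed is even better.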
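
The poster's Java program is not shown, so the following is only a sketch of how multi-threaded counting of this kind might look, assuming Java 8 or later. It uses a fixed thread pool and a `ConcurrentHashMap` whose `merge` call updates a shared counter atomically. The class name `ParallelFromCount` and the method `countDomains` are made up for illustration, and in-memory header strings stand in for the 10000 message files.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration: names and input format are assumptions.
class ParallelFromCount {
    // Same pattern as the Perl script: the domain after '@' on a From: line.
    private static final Pattern FROM =
        Pattern.compile("^From:.*@([\\w.]+)", Pattern.MULTILINE);

    /** Counts the From: domain of each header text using a pool of worker threads. */
    static Map<String, Integer> countDomains(List<String> headers, int threads) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String header : headers) {
            pool.execute(() -> {
                Matcher m = FROM.matcher(header);
                if (m.find()) {
                    // merge() is atomic on ConcurrentHashMap, so no explicit lock.
                    counts.merge(m.group(1), 1, Integer::sum);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> headers = Arrays.asList(
            "To: x@y.org\nFrom: a@example.com\n",
            "From: Bob <bob@example.com>",
            "Subject: none",
            "From: c@other.net");
        System.out.println(countDomains(headers, 4));
    }
}
```

When the work is dominated by opening and reading files, more threads than CPUs can help because threads waiting on I/O leave the CPUs free for others, which may be part of why 1000 threads beat 80 in the run described above.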
      • It means you didn't do it properly in Perl. It can also do multi-tasking...
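        • My Perl script is inside; please go ahead and "multitask" it. Perl's multi-threading is experimental, so I don't dare use it. Using fork() to spawn several processes is not only clumsy, but how would those processes update the same variable together?
          Would I have to use RPC? Shared memory?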

      • 64 bit? E450?
    • I remember reading a similar thread on comp.lang.c++; the conclusion was that there is no way of sorting by value for any map type.
      • Java neither.