Developing a Hadoop 2.x Map/Reduce Project in Eclipse
2020-11-09 13:13:54 · Editor: Xiaocai


This article demonstrates how to develop a Map/Reduce project in Eclipse.

1. Environment
  • Hadoop 2.2.0
  • Eclipse Juno SR2
  • Hadoop2.x-eclipse-plugin — for how to build, install, and configure the plugin, see: http://www.micmiu.com/bigdata/hadoop/hadoop2-x-eclipse-plugin-build-install/

2. Create an MR project
Click File → New → Other..., select "Map/Reduce Project", enter the project name micmiu_MRDemo, and create the new project.

3. Create the Mapper and Reducer
Click File → New → Other... and select Mapper; the generated class automatically extends Mapper. Creating a Reducer works the same way; you then fill in your own business logic. This article uses the bundled official WordCount example for testing:
    package com.micmiu.mr;
    /**
     * Licensed to the Apache Software Foundation (ASF) under one
     * or more contributor license agreements. See the NOTICE file
     * distributed with this work for additional information
     * regarding copyright ownership. The ASF licenses this file
     * to you under the Apache License, Version 2.0 (the
     * "License"); you may not use this file except in compliance
     * with the License. You may obtain a copy of the License at
     *
     * http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */
    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.GenericOptionsParser;
    public class WordCount {

      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          // Split the line into tokens and emit (word, 1) for each token.
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
          }
        }
      }

      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          // Sum all partial counts for this word and emit the total.
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
          System.err.println("Usage: wordcount <in> <out>");
          System.exit(2);
        }
        // Uncomment to target a specific HDFS namenode instead of the configured default:
        //conf.set("fs.defaultFS", "hdfs://192.168.6.77:9000");
        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        // The reducer doubles as a combiner, pre-summing counts on the map side.
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }
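The same job can also be submitted from a terminal once the project is exported as a jar. A minimal sketch, assuming the classes were exported to a file named micmiu_MRDemo.jar (the jar name is hypothetical; the hadoop jar invocation itself is standard):

    # Submit the job; the two arguments are the HDFS input directory
    # and an output directory that must not exist yet.
    hadoop jar micmiu_MRDemo.jar com.micmiu.mr.WordCount /user/micmiu/test/input /user/micmiu/test/output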
4. Prepare the test data
    micmiu-01.txt:
    Hi Michael welcome to Hadoop 
    more see micmiu.com
    micmiu-02.txt:
    Hi Michael welcome to BigData
    more see micmiu.com
    micmiu-03.txt:
    Hi Michael welcome to Spark 
    more see micmiu.com
Upload the three files whose names start with micmiu to HDFS:
    micmiu-mbp:Downloads micmiu$ hdfs dfs -copyFromLocal micmiu-*.txt /user/micmiu/test/input
    micmiu-mbp:Downloads micmiu$ hdfs dfs -ls /user/micmiu/test/input
    Found 3 items
    -rw-r--r-- 1 micmiu supergroup 50 2014-04-15 14:53 /user/micmiu/test/input/micmiu-01.txt
    -rw-r--r-- 1 micmiu supergroup 50 2014-04-15 14:53 /user/micmiu/test/input/micmiu-02.txt
    -rw-r--r-- 1 micmiu supergroup 49 2014-04-15 14:53 /user/micmiu/test/input/micmiu-03.txt
    micmiu-mbp:Downloads micmiu$
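If you want to confirm the upload before running the job, the files can be read back directly (a simple spot-check; the output reproduces the test data above):

    micmiu-mbp:Downloads micmiu$ hdfs dfs -cat /user/micmiu/test/input/micmiu-01.txt
    Hi Michael welcome to Hadoop
    more see micmiu.com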
5. Configure the run parameters
Open Run As → Run Configurations… and enter the program arguments on the Arguments tab, i.e. the job's input and output paths.

6. Run
Choose Run As → Run on Hadoop. When the job finishes, Eclipse's console shows the usual MapReduce progress and counter output, and the word counts can be read back from HDFS as shown below.
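A sketch of checking the result, assuming the output path argument was set to /user/micmiu/test/output (the path is hypothetical and depends on what you entered in step 5); the counts follow from the three test files above:

    micmiu-mbp:Downloads micmiu$ hdfs dfs -cat /user/micmiu/test/output/part-r-00000
    BigData	1
    Hadoop	1
    Hi	3
    Michael	3
    Spark	1
    micmiu.com	3
    more	3
    see	3
    to	3
    welcome	3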
At this point, running a Map/Reduce job from Eclipse against a local pseudo-distributed Hadoop 2.x setup has been demonstrated successfully.

PS: Submitting the same MR job to a real cluster environment keeps failing; the cause has not been found yet.

—————– EOF @Michael Sun —————–