Analysis of Twitter posts with the stream processing systems Apache Spark and Apache Storm
Abstract
Full Text:
PDF (Russian)
References
Hesla, “Particle physics tames big data” http://www.symmetrymagazine.org/article/august-2012/particle-physics-tames-big-data
Hirak Kashyap, Hasin Afzal Ahmed, “Big Data Analytics in Bioinformatics: A Machine Learning Perspective” http://arxiv.org/pdf/1506.05101.pdf
Eric D. Feigelson and G. Jogesh Babu, “Big data in astronomy” http://astrostatistics.psu.edu/2012Significance.pdf
Saeed Shahrivari and Saeed Jalili, “Beyond Batch Processing: Towards Real-Time and Streaming Big Data” https://arxiv.org/ftp/arxiv/papers/1403/1403.3375.pdf
Zeba Khanam and Shafali Agarwal, “Map-Reduce Implementations: Survey and Performance Comparison” http://airccse.org/journal/jcsit/7415ijcsit10.pdf
Apache Hadoop http://hadoop.apache.org/
Andrew C. Oliver, “Storm or Spark: Choose your real-time weapon” http://www.infoworld.com/article/2854894/application-development/spark-and-storm-for-real-time-computation.html
Apache Spark documentation http://spark.apache.org/docs/latest/
Apache Storm documentation http://storm.apache.org/releases/current/index.html
Apache Kafka documentation http://kafka.apache.org/documentation.html
Twitter Streaming API https://dev.twitter.com/streaming/overview
Apache Flume https://flume.apache.org/
Amazon Kinesis Streams https://aws.amazon.com/ru/kinesis/streams/
Apache Zookeeper https://zookeeper.apache.org/
AWS EC2 documentation https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/concepts.html
Apache Hadoop YARN https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/YARN.html
Apache Mesos http://mesos.apache.org/
Matei Zaharia, Tathagata Das, et al., “Discretized Streams: A Fault-Tolerant Model for Scalable Stream Processing” https://www2.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-259.pdf
Sanket Chintapalli, Derek Dagit, Bobby Evans, et al., “Benchmarking Streaming Computation Engines at Yahoo!” https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
Apache Flink https://flink.apache.org/
Source code of the Yahoo! performance benchmark https://github.com/yahoo/streaming-benchmarks
Peter F. Brown, Peter V. deSouza, Robert L. Mercer, et al., “Class-Based n-gram Models of Natural Language” http://www.aclweb.org/anthology/J92-4003
Alberto Barrón-Cedeño and Paolo Rosso, “On Automatic Plagiarism Detection Based on n-Grams Comparison” http://users.dsic.upv.es/~prosso/resources/BarronRosso_ECIR09.pdf
William B. Cavnar and John M. Trenkle, “N-Gram-Based Text Categorization” http://odur.let.rug.nl/~vannoord/TextCat/textcat.pdf
David Sundby, “Spelling correction using N-grams” http://fileadmin.cs.lth.se/cs/education/EDA171/Reports/2009/david.pdf
Hosebird Client https://github.com/twitter/hbc
Twitter Apps https://apps.twitter.com/
Source code of the producer program that sends tweets to Kafka https://github.com/GorshkovNikita/kafka-test
Source code of the programs for the Spark and Storm engines https://github.com/GorshkovNikita/streaming-engines-comparison
Jonathan Leibiusky, Gabriel Eisbruch and Dario Simonassi, “Getting Started with Storm”
Holden Karau, Andy Konwinski, Patrick Wendell, Matei Zaharia, “Learning Spark”
ISSN: 2307-8162