MCM 2012 Problem C: Modeling for Crime Busting (US mathematical modeling contest) - MATLAB mathematical modeling - GUSUCODE


Category: Mathematical modeling; Resource type: Document; File size: 415.44 KB; Uploaded: 2019-08-18 23:55:01; Downloads: 162; Resource points: 1; Provided by: jiqiren 20190818115631223
Content:
(MCM 2012C) 
 
Your organization, the Intergalactic Crime Modelers (ICM), is investigating a conspiracy to commit a criminal act. The investigators are highly confident they know several members of the conspiracy, but hope to identify the other members and the leaders before they make arrests. The conspirators and the possible suspected conspirators all work for the same company in a large office complex. The company has been growing fast and making a name for itself in developing and marketing computer software for banks and credit card companies. ICM has recently found a small set of messages from a group of 82 workers in the company that they believe will help them find the most likely candidates for the unidentified co-conspirators and unknown leaders. Since the message traffic is for all the office workers in the company, it is very likely that some (maybe many) of the identified communicators in the message traffic are not involved in the conspiracy. In fact, they are certain that they know some people who are not in the conspiracy. The goal of the modeling effort will be to identify people in the office complex who are the most likely conspirators. A priority list would be ideal so ICM could investigate, place under surveillance, and/or interrogate the most likely candidates. A discriminate line separating conspirators from non-conspirators would also be helpful to distinctly categorize the people in each group. It would also be helpful to the DA's office if the model nominated the conspiracy leaders. 
 
Before the data of the current case are given to your crime modeling team, your supervisor gives you the following scenario (called Investigation EZ) that she worked on a few years ago in another city. Even though she is very proud of her work on the EZ case, she says it is just a very small, simple example, but it may help you understand your task. Her data follow: 
 
The ten people she was considering as conspirators were: Anne#, Bob, Carol, Dave*, Ellen, Fred, George*, Harry, Inez, and Jaye#. (* indicates prior known conspirators, # indicates prior known non-conspirators) 
 
Chronology of the 28 messages that she had for her case with a code number for the topic of each message that she assigned based on her analysis of the message: 
 
Anne to Bob: Why were you late today? (1) 
 
Bob to Carol: That darn Anne always watches me. I wasn't late. (1) 
 
Carol to Dave: Anne and Bob are fighting again over Bob's tardiness. (1) 
 
Dave to Ellen: I need to see you this morning. When can you come by? Bring the budget files. (2) 
 
Dave to Fred: I can come by and see you anytime today. Let me know when it is a good time. Should I bring the budget files? (2) 
 
Dave to George: I will see you later---lots to talk about. I hope the others are ready. It is important to get this right. (3) 
 
Harry to George: You seem stressed. What is going on? Our budget will be fine. (2) (4) 
 
Inez to George: I am real tired today. How are you doing? (5) 
 
Jaye to Inez: Not much going on today. Wanna go to lunch today? (5) 
 
Inez to Jaye: Good thing it is quiet. I am exhausted. Can't do lunch today --- sorry! (5) 
 
George to Dave: Time to talk --- now! (3) 
 
Jaye to Anne: Can you go to lunch today? (5) 
 
Dave to George: I can't. On my way to see Fred. (3) 
 
George to Dave: Get here after that. (3) 
 
Anne to Carol: Who is supposed to watch Bob? He is goofing off all the time. (1) 
 
Carol to Anne: Leave him alone. He is working well with George and Dave. (1) 
 
George to Dave: This is important. Darn Fred. How about Ellen? (3) 
 
Ellen to George: Have you talked with Dave? (3) 
 
George to Ellen: Not yet. Did you? (3) 
 
Bob to Anne: I wasn't late. And just so you know --- I am working through lunch. (1) 
 
Bob to Dave: Tell them I wasn't late. You know me. (1) 
 
Ellen to Carol: Get with Anne and figure out the budget meeting schedule for next week and help me calm George. (2) 
 
Harry to Dave: Did you notice that George is stressed out again today? (4) 
 
Dave to George: Darn Harry thinks you are stressed. Don't get him worried or he will be nosing around. (4) 
 
George to Harry: Just working late and having problems at home. I will be fine. (4) 
 
Ellen to Harry: Would it be OK, if I miss the meeting today? Fred will be there and he knows the budget better than I do. (2) 
 
Harry to Fred: I think next year's budget is stressing out a few people. Maybe we should take time to reassure people today. (2) (4) 
 
Fred to Harry: I think our budget is pretty healthy. I don't see anything to stress over. (2) 
 
END of MESSAGE TRAFFIC 
 
Your supervisor points out that she assigned and coded only 5 different topics of messages: 1) Bob's tardiness, 2) the budget, 3) important unknown issue but assumed to be part of conspiracy, 4) George's stress, and 5) lunch and other social issues. As seen in the message coding, some messages had two topics assigned because of the content of the messages. 
 
The way your supervisor analyzed her situation was with a network that showed the communication links and the types of messages. The following figure is a model of the message network that resulted with the code for the types of messages annotated on the network graph. 
 
 
 
Figure 1: Network of messages from EZ Case 
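
Since the figure itself is not reproduced here, the network it describes can be rebuilt from the message list above. The following is a minimal Python sketch, assuming each message is encoded as a (sender, receiver, topic codes) tuple; the names `MESSAGES` and `build_network` are illustrative and do not come from the problem statement.

```python
from collections import defaultdict

# The 28 EZ messages in chronological order: (sender, receiver, topic codes).
MESSAGES = [
    ("Anne", "Bob", (1,)),
    ("Bob", "Carol", (1,)),
    ("Carol", "Dave", (1,)),
    ("Dave", "Ellen", (2,)),
    ("Dave", "Fred", (2,)),
    ("Dave", "George", (3,)),
    ("Harry", "George", (2, 4)),
    ("Inez", "George", (5,)),
    ("Jaye", "Inez", (5,)),
    ("Inez", "Jaye", (5,)),
    ("George", "Dave", (3,)),
    ("Jaye", "Anne", (5,)),
    ("Dave", "George", (3,)),
    ("George", "Dave", (3,)),
    ("Anne", "Carol", (1,)),
    ("Carol", "Anne", (1,)),
    ("George", "Dave", (3,)),
    ("Ellen", "George", (3,)),
    ("George", "Ellen", (3,)),
    ("Bob", "Anne", (1,)),
    ("Bob", "Dave", (1,)),
    ("Ellen", "Carol", (2,)),
    ("Harry", "Dave", (4,)),
    ("Dave", "George", (4,)),
    ("George", "Harry", (4,)),
    ("Ellen", "Harry", (2,)),
    ("Harry", "Fred", (2, 4)),
    ("Fred", "Harry", (2,)),
]


def build_network(messages):
    """Group messages by directed link and collect the topic codes seen on each link."""
    links = defaultdict(list)
    for sender, receiver, topics in messages:
        links[(sender, receiver)].extend(topics)
    return dict(links)


network = build_network(MESSAGES)
for (sender, receiver), topics in sorted(network.items()):
    print(f"{sender} -> {receiver}: topics {topics}")
```

Keeping the link direction and the full list of topic codes per link (rather than a set) preserves how often each topic was discussed between two people, which a later scoring step may want to use.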
 
Your supervisor points out that in addition to known conspirators George and Dave, Ellen and Carol were indicted for the conspiracy based on your supervisor's analysis, and later Bob self‐admitted his involvement in a plea bargain for a reduced sentence, but the charges against Carol were later dropped. Your supervisor is still pretty sure Inez was involved, but the case against her was never established. Your supervisor's advice to your team is to identify the guilty parties so that people like Inez don't get off, people like Carol are not falsely accused, and ICM gets the credit so people like Bob do not have the opportunity to get reduced sentences. 
 
The current case: 
 
Your supervisor has put together a network‐like database for the current case, which has the same structure, but is a bit larger in scope. The investigators have some indications that a conspiracy is taking place to embezzle funds from the company and use internet fraud to steal funds from credit cards of people who do business with the company. The small example she showed you for case EZ had only 10 people (nodes), 27 links (messages), 5 topics, 1 suspicious/conspiracy topic, 2 known conspirators, and 2 known non‐conspirators. So far, the new case has 83 nodes, 400 links (some involving more than 1 topic), over 21,000 words of message traffic, 15 topics (3 have been deemed to be suspicious), 7 known conspirators, and 8 known non‐conspirators. These data are given in the attached spreadsheet files: Names.xls, Topics.xls, and Messages.xls. Names.xls contains the key of node number to the office workers' names. Topics.xls contains the code for the 15 topic numbers to a short description of the topics. Because of security and privacy issues, your team will not have direct transcripts of all the message traffic. Messages.xls provides the links of the nodes that transmitted messages and the topic code numbers that the messages contained. Several messages contained up to three topics. To help visualize the message traffic, a network model of the people and message links is provided in Figure 2. In this case, the topics of the messages are not shown in the figure as they were in Figure 1. These topic numbers are given in the file Messages.xls and described in Topics.xls. 
 
 
 
Figure 2: Visual of the network model of the 83 people (nodes) and 400 messages between these people (links). 
 
Requirements: 
 
Requirement 1: So far, it is known that Jean, Alex, Elsie, Paul, Ulf, Yao, and Harvey are conspirators. Also, it is known that Darlene, Tran, Jia, Ellin, Gard, Chris, Paige, and Este are not conspirators. The three known suspicious message topics are 7, 11, and 13. There is more detail about the topics in file Topics.xls. Build a model and algorithm to prioritize the 83 nodes by likelihood of being part of the conspiracy and explain your model and metrics. Jerome, Delores, and Gretchen are the senior managers of the company. It would be very helpful to know if any of them are involved in the conspiracy. 
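
As one way to read this requirement, the sketch below ranks people by a crude heuristic score. It is only an illustrative baseline, not a prescribed or official solution. Because the column layout of Messages.xls is not described here, it runs on the small EZ example instead. The function name `priority_list`, the two features, and the equal 0.5 weights are all assumptions chosen for illustration.

```python
from collections import defaultdict

# EZ messages as (sender, receiver, topic codes), in chronological order.
MESSAGES = [
    ("Anne", "Bob", (1,)), ("Bob", "Carol", (1,)), ("Carol", "Dave", (1,)),
    ("Dave", "Ellen", (2,)), ("Dave", "Fred", (2,)), ("Dave", "George", (3,)),
    ("Harry", "George", (2, 4)), ("Inez", "George", (5,)), ("Jaye", "Inez", (5,)),
    ("Inez", "Jaye", (5,)), ("George", "Dave", (3,)), ("Jaye", "Anne", (5,)),
    ("Dave", "George", (3,)), ("George", "Dave", (3,)), ("Anne", "Carol", (1,)),
    ("Carol", "Anne", (1,)), ("George", "Dave", (3,)), ("Ellen", "George", (3,)),
    ("George", "Ellen", (3,)), ("Bob", "Anne", (1,)), ("Bob", "Dave", (1,)),
    ("Ellen", "Carol", (2,)), ("Harry", "Dave", (4,)), ("Dave", "George", (4,)),
    ("George", "Harry", (4,)), ("Ellen", "Harry", (2,)), ("Harry", "Fred", (2, 4)),
    ("Fred", "Harry", (2,)),
]
KNOWN_CONSPIRATORS = {"Dave", "George"}
KNOWN_NON_CONSPIRATORS = {"Anne", "Jaye"}
SUSPICIOUS_TOPICS = {3}


def priority_list(messages, conspirators, non_conspirators, suspicious):
    """Return (person, score) pairs sorted from most to least suspicious."""
    total = defaultdict(int)       # messages the person sent or received
    flagged = defaultdict(int)     # ...that carried a suspicious topic
    with_known = defaultdict(int)  # ...whose other party is a known conspirator
    for sender, receiver, topics in messages:
        is_suspicious = bool(suspicious & set(topics))
        for person, other in ((sender, receiver), (receiver, sender)):
            total[person] += 1
            flagged[person] += is_suspicious
            with_known[person] += other in conspirators

    scores = {}
    for person in total:
        if person in conspirators:
            scores[person] = 1.0   # pinned: already known to be involved
        elif person in non_conspirators:
            scores[person] = 0.0   # pinned: already known to be uninvolved
        else:
            # Assumed equal-weight blend of the two message fractions.
            scores[person] = (0.5 * flagged[person] / total[person]
                              + 0.5 * with_known[person] / total[person])
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


ranking = priority_list(
    MESSAGES, KNOWN_CONSPIRATORS, KNOWN_NON_CONSPIRATORS, SUSPICIOUS_TOPICS
)
for person, score in ranking:
    print(f"{person:7s} {score:.3f}")
```

The same function can be called again with a different set of suspicious topics or known conspirators to see how the ordering shifts. A baseline this simple ignores indirect links and message direction, so a serious model would need richer network measures.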
 
Requirement 2: How would the priority list change if new information comes to light that Topic 1 is also connected to the conspiracy and that Chris is one of the conspirators? 
 
Requirement 3: A powerful technique to obtain and understand text information similar to this message traffic is called semantic network analysis; as a methodology in artificial intelligence and computational linguistics, it provides a structure and process for reasoning about knowledge or language. Another computational linguistics capability in natural language processing is text analysis. For our crime busting scenario, explain how semantic and text analyses of the content and context of the message traffic (if you could obtain the original messages) could empower your team to develop even better models and categorizations of the office personnel. Did you use any of these capabilities on the topic descriptions in file Topics.xls to enhance your model? 
 
Requirement 4: Your complete report will eventually go to the DA so it must be detailed and clearly state your assumptions and methodology, but cannot exceed 20 pages of write-up. You may include your programs as appendices in separate files that do not count in your page restriction, but including these programs is not necessary. Your supervisor wants ICM to be the world's best in solving white-collar, high‐tech conspiracy crimes and hopes your methodology will contribute to solving important cases around the world, especially those with very large databases of message traffic (thousands of people with tens of thousands of messages and possibly millions of words). She specifically asked you to include a discussion on how more thorough network, semantic, and text analyses of the message contents could help with your model and recommendations. As part of your report to her, explain the network modeling techniques you have used and why and how they can be used to identify, prioritize, and categorize similar nodes in a network database of any type, not just crime conspiracies and message data. For instance, could your method find the infected or diseased cells in a biological network where you had various kinds of image or chemical data for the nodes indicating infection probabilities and already identified some infected nodes? 
 
*Your ICM submission should consist of a 1 page Summary Sheet and your solution cannot exceed 20 pages for a maximum of 21 pages. 
 

File list:

2012_ICM/
2012_ICM/2012_ICM_Problem.pdf
2012_ICM/Messages.xls
2012_ICM/Names.xls
2012_ICM/Topics.xls
mcm2012CModeling for Crime Busting 罪犯克星模型 .pdf
