MAY 2, 2004

Date:   Sunday May 2, 2004.
Time:   1:00 p.m. - 5:00 p.m., approximately
Place:  Boston Park Plaza Hotel 

Attendees: Robert Frederking,  Graeme Hirst, Lillian Lee,
           Diane Litman, Kathy McCoy, Dragomir Radev,  
           Ellen Riloff, Janyce Wiebe

Guests:    Christine Doran, Julia Hirschberg, Priscilla Rasmussen

Addendum:  Report of the General Chair and SubChairs, HLT/NAACL-2004 (Excerpts)


  1. HLT/NAACL-2004 conference
  2. ACL 2005
  3. NAACL financial data
  4. Bids for NAACL 2006
  5. Summer schools
  6. Position on an HLT meeting in 2005
  7. Misc

[1] HLT/NAACL-2004 CONFERENCE

Julia Hirschberg reviewed the report (excerpted below) of the general chair and sub-chairs for HLT/NAACL 2004. She noted that all the reports were in within one day of the announced deadline, and all but one were in on time. Below are points not otherwise raised in the conference report.

-- About registration --

There was some discussion of the number of registrations this year, and especially about whether and/or why the student registrations were relatively low. Kathy McCoy reported that there were a few problematic registrations, including redundant registrations, possibly caused by web-server delays being interpreted as failures. It was recommended that the "I'm a invited speaker and so don't have to pay" option be deleted from the online registration form, since people were confused as to what "invited speaker" status is. Otherwise, online registration was pretty smooth.

-- About the reviewing process for the main conference --

Of the 19 submissions that were rejected as long papers but invited to be revised and re-submitted as short papers, 5 were re-submitted. There were fewer short-paper submissions than expected.

Some reviewers felt they did not receive papers in their area of expertise; this might be due to the flexibility of the Area Chairs (which indicates that "Senior Program Committee" might be a better name).

The START conference-review software allows conference chairs and reviewers to see the identity of reviewers. Cyberchair doesn't have this property, but was not necessarily recommended. START licensing fees are approximately $500 for the first conference, $400 for the second, and then $100 for each subsequent one.

-- About organization --

A general problem was that lines of authority weren't always clear. For example, it had been generally assumed that the ACL Business Manager or the ACL Treasurer set the fees, but instead, it turned out to be the responsibility of the NAACL Treasurer.
A complicating factor is that tradition and the ACL Conference Handbook (which is ACL-meeting specific) interact in complex ways. For instance, according to the Handbook, co-located workshops are the responsibility of the General Chair, but the Workshop Chair(s) might be a better choice. Perhaps the Handbook could be "parametrized" for chapter meetings, or could specify where latitude exists. Also, perhaps a particular NAACL Executive board member could be designated as the one in charge of keeping an eye on the conference and of determining which questions need to be passed up to the ACL Executive.

One topic for consideration is how to better incorporate institutional memory; one hurdle is that the chairs and sub-chairs all change every year. Perhaps, given the three-(sub)chair structure, one of the chairs could serve as a co-chair for the following year. Another question was the status of a proposal making the sponsorship chair a paid position that is filled by the same person from one year to the next.

-- About the Student Research Workshop --

Funding for non-USA students to attend the Student Research Workshop was a problem. NSF may prefer USA students ("may" because it was noted that for a AAAI spring symposium, NSF and ONR didn't seem to distinguish USA from non-USA students). The need to use US carriers was a major problem. One positive aspect was that the overhead on the NSF funding was low, perhaps because of low overhead rates at the institution involved (Ohio State). IBM sponsored in full an evening event.

The submission rate was low. Perhaps this is due to interactions with the short papers of the main session. One possibility would be to funnel some rejected short-paper submissions to the Student Research Workshop submission pool.

-- About the workshops --

There was much discussion about "exception" workshops, such as SIG meetings and/or workshops that want different treatment from the "standard" set-up (examples: meeting off-site, meeting on a different day than the designated workshop days). "Exception" workshops can cause a great deal of overhead for organizers, especially because it is currently unclear who is in charge of what and what policies apply for such cases (examples: Who pays for catering? Should a "register for workshop X only" option, which avoids the general-conference registration fee, be allowed?).

There were some problems with workshop organizers making up their own schedules without having been told about the communal breaks schedule.

-- About local arrangements --

Christine Doran reviewed her report. Negotiations with hotels and related entities were done before MITRE's bid to host the meeting was approved, because hotel costs had to be submitted with the final bid materials. No hotels of a suitable size were to be found in Cambridge. An advantage of dealing with a hotel, rather than a very large venue, is that extra negotiating leverage results from being a very big customer from the hotel's point of view. In the end, a very good room rate was acquired (as evidence, many people are staying extra nights beyond the conference; some did complain and found alternate housing in hostels or the YMCA). The meeting substantially exceeded the room contract, and the hotel, being very pleased with this, threw in some extras. A perk for those who booked rooms through NAACL was that they could make free 800 and local calls. The central locale was a plus, although there were some problems with traffic.

The hotel provided swan pins, subsequently dubbed "swans of power", to those organizers who were authorized to make decisions on the meeting's behalf, to make them visually identifiable to hotel staff.
This was a nice system, although it provoked the question of whether there were one swan to rule them all.

Wireless hubs were purchased. Because there were multiple speech demos, there was a need to give the demos extra space. This required re-arranging the demo and poster space.

The total budgeted cost for the banquet was $17310, for 400 people. At the time, about 200 tickets, at $65 each, had been sold.

MITRE received some money for administrative support; in return, some of their administrative staff helped out on site. MITRE also set up a "holding" account, which allowed for tax-exempt status to apply and which made paying costs easier. There was some confusion about who was authorized to sign contracts, write checks, and so on (the NAACL Treasurer? the General Chair? the Business Manager?).

-- Other topics --

Because some of the tutorials had low pre-registration numbers, it might be necessary to reduce the number of tutorials from 6 (two per area), especially since it was observed that in general the number of people who attend tutorials seems to remain constant.

Steven Bird, Dragomir Radev, and Sandra Carberry may have found a good solution to the problem of the ACL website location.

Problems with Omnipress include the fact that they didn't handle the arrangements for proceedings delivery well.

The integration of the three main areas could be improved.

[2] ACL 2005

A reminder was given that although Dragomir Radev is NAACL Treasurer and therefore in usual circumstances would be an ex-officio member of the oversight committee for ACL 2005, because he is Local Arrangements chair for that conference, he will not be on the oversight committee to avoid conflicts of interest.

Dragomir Radev reviewed his presentation for the business meeting. At the time, preparation of the final budget and general-chair selection were still in progress.
A point of discussion was whether to include a local, i.e., North American, flavor to ACL 2005; an example would be a Latin American panel. It was suggested that someone spearhead whatever effort is undertaken.

[3] NAACL FINANCIAL DATA

The 2004 balance was $21.8K. The Summer School cost $10K, and $4K was received from ACL 2002 (this latter number had initially been predicted to be higher).

[4] BIDS FOR NAACL 2006

Janyce Wiebe presented a draft of the preliminary call for bids for review. Possible locations were discussed.

[5] SUMMER SCHOOLS

Reminder: This is the final year of the agreement between NAACL and the Johns Hopkins CLSP summer school. In review, it seems like an expensive (approximately $1000 per student) but fruitful arrangement.

Approved: a proposal, with attendant conditions, for NAACL sponsorship of the Linguistic Society of America's 2005 summer school.

[6] POSITION ON AN HLT MEETING IN 2005

Previous discussions on this topic were reviewed: the general question was what role NAACL should play (advisory? financial? none?) with respect to an HLT meeting in 2005, should one occur.

Scenarios discussed included (a) HLT being co-located or co-organized with a different meeting than ACL 2005, for example, CoNLL, IJCAI, or SIGDAT; (b) the HLT meeting being a workshop at ACL 2005; and (c) ACL 2005 and the HLT meeting co-locating. These options could impact people's perceptions of the HLT meeting's prestige and HLT tradition in different ways. Scenario (b) had the potential to raise the "Exception Workshop" concerns discussed above (see "About the workshops"). Advantages of (c), co-location, include maintaining a positive influence on the integration of speech, IR, and CL; disadvantages include organizational questions and conflicts of deadlines.
It was also noted that in general, ACL treats co-located entities as workshops, with ACL taking an overhead percentage, with larger meetings paying a larger rate.

[As of September 2004, option (a), with SIGDAT, had been chosen.]

[7] MISC.

Plans were made for what to say at the HLT-NAACL 2004 Business Meeting: announcement of election winners, solicitation of nominations from the membership at large, sundry thanks.

It was agreed that we would hold off on any constitutional issues until June, when election business arises.

The nominating committee should be contacted in the summer.

Another phone meeting will occur in late fall.

END OF MEETING
****************************************************************************

Excerpts from the Report of the General Chair and SubChairs, HLT/NAACL-2004

Note: formatting alterations stem from conversion to .txt and some post-hoc format editing

General Chair Report

Julia Hirschberg, Columbia

High Points:

Low Points/Suggestions:

Local Arrangements Chair Report

Christine Doran, MITRE Corp, cdoran@mitre.com

MITRE was solicited for a bid to host HLT/NAACL 2004, and to the best of my knowledge, our bid was the only one submitted. From the beginning, we were clear that we wanted a downtown event, which meant a hotel, and the board expressed two major concerns with that plan, based on previous problems with having the meeting in a hotel: first, the cost of the hotel rooms, as Boston is an expensive city, and second, the issue of committing to a block of hotel rooms.

We considered four hotels in downtown Boston: the Sheraton, the Park Plaza, the Marriott and the Westin. The Park Plaza offered the best room rate ($169 when we started negotiations) and after the site visit, there was a clear consensus that this hotel would be the best fit for us, in terms of size, price, location, amenities and character. After quite a few rounds of negotiations, we agreed to a block of 1090 rooms at $139/night for singles and $144 for doubles, and paid a token $1500 for all of our meeting spaces. As of this writing, we have booked 1364 room nights.

The main downside with using a hotel was that we had to reserve our meeting spaces well in advance of the event. We were able to make some last-minute moves and adjustments, but for the most part were constrained by our early speculation as to how many and how large our meeting rooms were. It remains to be seen how good these guesses were.

Banquet: We looked at several historic buildings and museums. We really liked the Fogg Art Museum at Harvard, but it was very expensive to rent.
We decided good and plentiful food was more important than art, so we went instead with the Old South Meeting House. The OSMH figures prominently in Boston's history, is walking distance from the hotel and is a beautiful space. We spent the bulk of our banquet budget on food, from a small local caterer.

Computers: We rented 10 PCs from a national chain, Rent-A-PC. They were used for the 2002 Philadelphia ACL meeting and came highly recommended.

Audio-Visual: We are using the hotel's in-house A/V company. We eventually negotiated a significant discount with them (to match our other best bids), and felt the on-site support would be valuable, despite some communications problems. It turns out that our main contact person was in and out of the hospital while we were trying to finalize our contract.

Finances: The handling of finances was a particularly confusing issue. It was only very late in the process that we learned that the local arrangements folks typically handle the bulk of the finances. It would have been useful to clearly specify at the beginning how the money would flow for particular income and expenses.

Printing: We used MITRE's in-house publications and graphics people to do the graphics for the bag, banner and thermos, and for copies of tutorial proceedings, etc.

Program Committee Chairs Report

Susan Dumais, Microsoft Research, sdumais@microsoft.com
Daniel Marcu, ISI/USC, marcu@isi.usc.edu
Salim Roukos, IBM, roukos@us.ibm.com

1. Schedule

  Nov 15, 2003   Submission deadline for Full papers
  Jan 17, 2004   PC meeting
  Jan 23, 2004   Notification accept/reject for Full papers
  Feb 4, 2004    Submission deadline for Late-breaking short papers and posters
  Mar 8, 2004    Notification accept/reject for Late-breaking short papers and posters
  Mar 17, 2004   Camera-ready copy for Full papers, Late-breaking papers, Posters
  May 2-7, 2004  Conference

2. Overview remarks

The co-chairs represent the three main fields covered by HLT/NAACL 2004: Susan Dumais (IR), Daniel Marcu (NLP) and Salim Roukos (Speech). We divided a few tasks like suggestions for reviewers and assignment to area chairs by discipline, but most tasks cut across the disciplines. We divided the work roughly as follows, although everyone responded to email and issues as they came up: Susan, early activities like recruiting, paper templates, web info and final report; Daniel, review software, site management and PC meeting hosting; Salim, publications and final schedule. This minimal coordination worked well in general.

3. Paper reviewing process

We think the paper reviewing process went very well. Despite the early deadline, the quality of full-paper submissions was very high. We think this is largely due to the quality of makeup of the program committee, and the growing recognition of HLT/NAACL as an outlet for good work at the intersection of NLP, IR and Speech.

Reviewing was done using a two-tiered system, Area Chairs and Reviewers. Twenty Area Chairs were responsible for a topical area and coordinated the reviewing process (recruiting reviewers, assigning papers to reviewers, managing reviews and attending the PC meeting) in those areas. The Area Chairs are listed at the end of this section. The same review committee handled both Long papers (8 pages) and Late-breaking papers and posters (4 pages). The Co-Chairs made an initial assignment of submissions to Area Chairs.

There was a face-to-face PC meeting for Long papers. Because of the different backgrounds of participants, we spent the first hour in a calibration process, looking at top, middle and bottom papers from each discipline. This worked well, and made many of the subsequent discussions easier. Final decisions for Late-breaking papers were made by conference call. Some Late-breaking papers were selected to be presented orally and others as posters.

Reviewing for both Full and Late-breaking papers was blind. In cases where PC or Area Chairs were authors of a paper, or papers were from their institution or from former students or collaborators, others handled the reviewing process; persons with any conflict of interest left the room during discussions of a paper, and these persons had no influence on the paper's final disposition, both for Long and Short submissions.

Because of the conference time of early May, submissions were due in the fall, which is a busy time for other conferences as well. Long paper reviewing was carried out over the Dec/Jan holidays, but it worked out pretty well. Late-breaking papers were due after the decisions for Long papers were announced, thus allowing people to resubmit if desired (and, in fact, some were encouraged to do so by the initial reviewers). The turn-around time for Late-breaking papers was tight. There was only one month between the submission deadline and notification, and only ten days between notification and the date final camera-ready papers were due, but things worked smoothly.

Daniel handled the selection and ongoing maintenance of the reviewing software, which was quite a task. He selected the START conference reviewing software. In general the software worked well, although there was some inflexibility which we had to work around. The main issue had to do with sharing of files by Area Chairs. Because there was some overlap of areas, sub-groups wanted to share responsibilities, but this was difficult to do explicitly. Overall, the reviewing software was adequate and the customer support outstanding. Some additional functionality would be desired, but that is true for any software tool. Most of the problems in using the software package occurred because the PC chairs and area chairs did not put sufficient time into reading the associated documentation in advance. We recommend using this software package for future HLT/NAACL conferences.

Based on the distribution of submissions from last year, we selected twenty area chairs. Some area chairs covered the same nominal area, which we did to balance the anticipated load. We could have chosen to further sub-divide some areas like Information retrieval, but this seemed risky given the difficulties in predicting the nature of submissions. Most of the Area Chairs have pretty broad backgrounds, which gave us added flexibility in managing submissions. In general the load balancing worked pretty well, and we tried to assign roughly 10 Long Papers and 5 Short Papers to each Area Chair. We could have used another Area Chair in Discourse/Dialog and Syntax/Semantics, and one fewer in Learning and Information Retrieval. There were more Speech papers submitted as Short Papers than Long Papers.

Based on the high quality of submissions, we decided to present a Best Paper Award. The paper we selected received all 5's from reviewers and the area chair. The entire PC reviewed the paper, and thought that it represented a strong mix of theory and practice and was deserving of the award. The Best Paper Award for HLT/NAACL 2004 is awarded to: Catching the drift: Probabilistic content models, with applications to generation and summarization, Regina Barzilay (MIT) and Lillian Lee (Cornell University).

We invited two Keynote Speakers, both of whom apply HLT tools and techniques to large-scale, commercial applications. The first keynote address by Dr. Andrei Broder, entitled "Ten years of Web Search Technology", will give an overview of the evolution of web search over the past decade, how users' expectations are evolving based on their use of web search technology, and implications of this work in the enterprise search arena. The second keynote address by Dr. Jill Burstein, entitled "Automated Essay Evaluation: From NLP research through deployment as a business", will describe the development of technology for automatic essay evaluation and its deployment as a business.

The Area Chairs, Affiliations (Area) were:

  Srinivas Bangalore, AT&T Labs (Syntax/Semantics)
  Charlie Clarke, University of Waterloo (Information retrieval)
  Sadaoki Furui, Tokyo Institute of Technology (Speech)
  Jim Glass, MIT (Speech)
  Joshua Goodman, Microsoft Research (Language modeling/Learning)
  Warren Greiff, MITRE (Information retrieval)
  Ralph Grishman, NYU (Information extraction)
  Sanda Harabagiu, University of Texas, Dallas (Question answering)
  Don Hindle, Primus Knowledge Systems (Syntax/Semantics)
  Candy Kamm, FxPal (Discourse and dialog)
  Inderjeet Mani, Georgetown University (Generation and summarization)
  Andrew McCallum, University of Massachusetts (Language modeling/Learning)
  Kathy McKeown, Columbia University (Generation and summarization)
  Bob Moore, Microsoft Research (Machine translation)
  Hermann Ney, RWTH Aachen (Machine translation)
  Doug Oard, University of Maryland (Information retrieval)
  Kishore Papineni, IBM T. J. Watson Research Center (Machine translation)
  John Prager, IBM T. J. Watson Research Center (Question answering)
  Brian Roark, AT&T Labs (Syntax/Semantics)
  Roni Rosenfeld, CMU (Language modeling/Learning)

4. Summary of paper quality and acceptances

The number of submissions increased only slightly over last year, and the quality of submissions was excellent. We received 168 submissions for full papers, of which 43 were accepted, resulting in a highly competitive acceptance rate of 26%. (Thirty-nine submissions received an average score of 4.0 or higher, on a reviewing scale of 1 to 5.) In addition, we received 84 submissions for the late-breaking papers track, of which 40 were accepted. Half of the short papers are presented as short talks and others as posters.

5. Publications

The distribution of work for putting together the proceedings and the online schedule could be improved. Working with two proceedings chairs (one for Full papers and one for Late-breaking papers) and a web master complicates the logistics.
More importantly, exactly who needed what information and when it was needed was not clear. We started getting requests for the final lists and schedule before the Late-breaking papers had been decided on. It would also have been ideal to have a single agreed-upon format for the information so that everyone was working from the same data source. With many slightly different versions and formats in circulation, errors get introduced. And it would have been good to allow authors to review the information that was going to appear in the proceedings and CDs ahead of time.

6. Areas for Improvement

We did our best to merge the different goals of the communities, and we think we succeeded for the most part. We also worked hard to solicit involvement from all of the associated communities, and we are especially pleased about the interdisciplinary nature of many of the papers. As the conference becomes better known, we hope to see this trend continue. We feel that, although there is room for improvement, the HLT/NAACL merge was a success in attracting high quality work and in bridging gaps, and hope to see it continue next year.

Most of our decisions had to be approved by a large and diverse committee representing NAACL, HLT, ISCA, and SIGIR, as well as different government sponsors of HLT research. The situation was cumbersome at times, but usually issues were resolved within a few days. Minimizing the number of issues that the committee needs to approve would streamline the process.

We were invited to help in organizing HLT/NAACL 2004 in early June 2003. This is somewhat later than desirable, since the call for participation and the recruiting of reviewers needed to take place very quickly. It would have been better to do this before the HLT/NAACL 2003 meeting, so initial observations and coordination could have started at the preceding conference.

7. Profiles of Submissions

Below we summarize the makeup of the Full paper and Late-breaking paper submissions, in terms of international representation and in terms of topic distributions. The keywords were selected from a pre-defined list by the contact author.

-------------------------------------------------------------------
----- Full Paper Keywords -----
-------------------------------------------------------------------
Number of submissions: 168
Number of acceptances: 43
Average number of keywords per paper: 3.0

Keyword counts:
   10  Anaphora resolution
    3  Cross language information retrieval
   11  Dialogue structure and dialogue systems
   13  Discourse
   21  Evaluation
   22  Human Language Applications
   27  Information extraction
   20  Information retrieval
    9  Language generation
   19  Language modeling
   25  Lexical and knowledge acquisition
   18  Lexicons and ontologies
   26  Machine translation of speech and text
    2  Message and narrative understanding systems
    5  Morphology
   17  Multilingual processing
    4  Multimodal representations and processing
   11  Natural language interfaces
   18  Other
   18  Parsing
    2  Phonology
    1  Pragmatics
   13  Question answering
    5  Rich transcription
   23  Semantics
   13  Speech recognition
   72  Statistical and learning techniques
    1  Style
   10  Summarization
   13  Syntax
   19  Tagging
    1  Text planning
    2  Text to speech
   21  Tools and resources
   12  Treebanks, proposition banks, and frame banks
-------------------------------------------------------------------
Full papers, countries (of contact author only):
    1  Brazil
    5  Canada
    1  China
    1  Czechoslovakia
    1  Denmark
    1  France
    8  Germany
    1  Greece
    1  Hong Kong
    1  India
    3  Ireland
    2  Italy
   11  Japan
    0  Mexico
    0  New Zealand
    1  Portugal
    0  South Korea
    0  Spain
    1  Sweden
    1  Thailand
   11  UK
  117  US

-------------------------------------------------------------------
----- Short Paper Keywords -----
-------------------------------------------------------------------
Number of submissions: 84
Number of acceptances: 39 (20 oral presentation; 19 poster presentation)
Average number of keywords per submission: 3.1
-------------------------------------------------------------------
Keyword counts:
    1  Anaphora resolution
    4  Cross language information retrieval
    9  Dialogue structure and dialogue systems
    7  Discourse
    8  Evaluation
   14  Human Language Applications
   15  Information extraction
   15  Information retrieval
    2  Language generation
   16  Language modeling
    6  Lexical and knowledge acquisition
    6  Lexicons and ontologies
    4  Machine translation of speech and text
    2  Message and narrative understanding systems
    2  Morphology
    7  Multilingual processing
    5  Multimodal representations and processing
    6  Natural language interfaces
   11  Other
    6  Parsing
    2  Phonology
    1  Pragmatics
    2  Question answering
    9  Rich transcription
   13  Semantics
   20  Speech recognition
   30  Statistical and learning techniques
    5  Summarization
    4  Syntax
    7  Tagging
    1  Text planning
    2  Text to speech
   11  Tools and resources
    5  Treebanks, proposition banks, and frame banks

Short papers, countries (of contact author only):
    0  Brazil
    0  Canada
    0  China
    0  Czechoslovakia
    0  Denmark
    1  France
    0  Germany
    1  Greece
    3  Hong Kong
    0  India
    1  Ireland
    1  Italy
    5  Japan
    1  Mexico
    1  New Zealand
    0  Portugal
    1  South Korea
    1  Spain
    3  Sweden
    0  Thailand
    1  UK
   63  US
-------------------------------------------------------------------

--------------------------------------------------------------------
----- Roll Ups (NLP, IR, Speech, All) for Keywords -----
Attached spreadsheet allows variation in assignment of area to keyword
--------------------------------------------------------------------
Long Submissions   Short Submissions
  Num  Fraction      Num  Fraction    Area         Keyword
   10  0.020           1  0.004       NLP          Anaphora resolution
    3  0.006           4  0.016       IR           Cross language information retrieval
   11  0.022           9  0.035       NLP Speech   Dialogue structure and dialogue systems
   13  0.026           7  0.027       NLP          Discourse
   21  0.041           8  0.031       All          Evaluation
   22  0.043          14  0.054       All          Human Language Applications
   27  0.053          15  0.058       NLP          Information extraction
   20  0.039          15  0.058       IR           Information retrieval
    9  0.018           2  0.008       NLP          Language generation
   19  0.037          16  0.062       All          Language modeling
   25  0.049           6  0.023       NLP          Lexical and knowledge acquisition
   18  0.036           6  0.023       NLP          Lexicons and ontologies
   26  0.051           4  0.016       Speech       Machine translation of speech and text
    2  0.004           2  0.008       NLP          Message and narrative understanding systems
    5  0.010           2  0.008       NLP          Morphology
   17  0.034           7  0.027       All          Multilingual processing
    4  0.008           5  0.019       NLP          Multimodal representations and processing
   11  0.022           6  0.023       All          Natural language interfaces
   18  0.036          11  0.043       All          Other
   18  0.036           6  0.023       NLP          Parsing
    2  0.004           2  0.008       NLP          Phonology
    1  0.002           1  0.004       NLP          Pragmatics
   13  0.026           2  0.008       IR NLP       Question answering
    5  0.010           9  0.035       All          Rich transcription
   23  0.045          13  0.050       NLP          Semantics
   13  0.026          20  0.078       Speech       Speech recognition
   72  0.142          30  0.116       All          Statistical and learning techniques
    1  0.002           0  0.000       NLP          Style
   10  0.020           5  0.019       NLP IR       Summarization
   13  0.026           4  0.016       NLP          Syntax
   19  0.037           7  0.027       NLP          Tagging
    1  0.002           1  0.004       NLP          Text planning
    2  0.004           2  0.008       Speech       Text to speech
   21  0.041          11  0.043       All          Tools and resources
   12  0.024           5  0.019       NLP          Treebanks, proposition banks, and frame banks
  507                258

Roll Ups
 Long  Short  Area
  237    101  NLP
   46     26  IR
   52     35  Speech
  206    112  All

--------------------------------------------------------------------
----- Roll Ups (NLP, IR, Speech, All) for Keywords -----
Attached spreadsheet allows variation in assignment of area to keyword
--------------------------------------------------------------------
 Long  Short  Area         Area Chair
   13      6  NLP          Srinivas Bangalore, AT&T Labs (Syntax/Semantics)
    8      4  IR           Charlie Clarke, University of Waterloo (Information retrieval)
    6      9  Speech       Sadaoki Furui, Tokyo Institute of Technology (Speech)
    6      9  Speech       Jim Glass, MIT (Speech)
    7      6  NLP IR       Joshua Goodman, Microsoft Research (Language modeling/Learning)
    7      3  IR           Warren Greiff, MITRE (Information retrieval)
   10      4  NLP          Ralph Grishman, NYU (Information extraction)
   10      2  NLP IR       Sanda Harabagiu, University of Texas, Dallas (Question answering)
   13      2  NLP          Don Hindle, Primus Knowledge Systems (Syntax/Semantics)
   14         NLP Speech   Candy Kamm, FxPal (Discourse and dialog)
    8      3  NLP          Inderjeet Mani, Georgetown University (Generation and summarization)
    7      6  IR           Andrew McCallum, University of Massachusetts (Language modeling/Learning)
    8      3  NLP          Kathy McKeown, Columbia University (Generation and summarization)
    7      2  NLP          Bob Moore, Microsoft Research (Machine translation)
    6      0  NLP          Hermann Ney, RWTH Aachen (Machine translation)
    5      5  IR           Doug Oard, University of Maryland (Information retrieval)
    7      2  NLP/Speech   Kishore Papineni, IBM T. J. Watson Research Center (Machine translation)
    9      2  NLP IR       John Prager, IBM T. J. Watson Research Center (Question answering)
   12      3  NLP          Brian Roark, AT&T Labs (Syntax/Semantics)
    5      5  NLP          Roni Rosenfeld, CMU (Language modeling/Learning)
 ------
  168     83

Roll Ups
 Long  Short  Area
  129     47  NLP
   53     28  IR
   33     27  Speech

Student Workshop Chairs Report

Student organizers:
  Nicola Stokes, University College, Dublin
  Karen Livescu, MIT
  Ani Nenkova, Columbia University

Faculty co-advisors:
  Amanda Stent, Stony Brook University
  Eric Fosler-Lussier, Ohio State University
  Lisa Ballesteros, Mount Holyoke College

Responsibilities of the student organizers:
  Advertise the workshop
  Contact reviewers and panelists
  Manage reviewing and make acceptance/rejection decisions
  Contact with authors

Responsibilities of the faculty advisors:
  NSF funding for authors to attend the workshop
  Advising the organizers

Number of papers submitted: 12 (9 NLP, 3 IR/speech/other [bioinformatics])
Number of papers accepted: 10 (7 NLP, 3 IR/speech/other)
1 paper was received too late for consideration.

We received NSF funding: One author per paper is funded to attend the workshop.

Comments from the organizers:

1) The workshop is a lot of work. Multiple student organizers are definitely needed.
2) This is the second HLT/NAACL student workshop, and for the second time the number of submissions was low. In addition to advertising through ACL mailing lists, the organizers contacted universities directly by email, but this did not seem to help. Consequently, the acceptance rate was high. One organizer commented: "I can see two reasons for such a small number of submissions--1) HLT has the short paper session, as well as a demo session, so many people prefer to submit there rather than at a separate student workshop. 2) the usual set-up of student/advisor relationship in the USA makes it very difficult for the student to have a publication without their advisor, thus the student session is not an option (and this is also why the short paper/poster session will be preferred)." We think that if funding were guaranteed, and could therefore be widely publicized with the CFP, there might be more submissions. However, this would not be easy to achieve. The low number of submissions is a big problem for the student workshop.

3) There are too many faculty advisors, and their roles are not clear.

4) It would be nice to get reviewer suggestions from the program chairs for the main conference.

Comments from the faculty advisors:

1) NSF funding is important for the student workshop. However, there is some tension here: NSF is primarily interested in funding domestic students, but HLT might like to attract international students to the student workshop (and it is expensive to bring these students here).

2) One author was unable to get a visa. The short time period between paper acceptance and the conference does not help here.

3) The roles of the faculty advisors were not always clear, so sometimes responses to the organizers were delayed because all advisors were waiting for someone else to offer advice first, not wanting to step on anyone's toes. (Note from Amanda: I should have delineated clear lines of responsibility.)
   However, there is too much work for one faculty advisor, when the need to get funding is taken into account.

4) The student organizers did a wonderful job. They collected papers, communicated with the authors, coordinated the reviewing, collected the results and made the final decisions about acceptance/rejection. They also spent a long time communicating with potential panelists. Special thanks to Ani for contacting panelists, to Karen for collecting and collating reviews, and to Nicola for setting up the schedule.

Advice for future organizers/advisors:

1) Start on the proposal early, as many questions will have to be sorted out (money required for the room, for proceedings etc.).

2) Start collecting reviewers and panelists early.

3) In order to increase the number of submissions, it is important to advertise widely and often. Also, consider ways to increase the prestige of the student session so that more people will submit.

4) Make a firm policy regarding international students, especially whether there will be more funding for an international paper than for a domestic one (and where it will come from).

5) Picking a day when a good audience can be expected is a guessing game. We chose the [tutorial] day, the day before the conference. However, other student workshops/sessions are held during the main conference.


Publications Chairs Report

  Miles Osborne, Edinburgh University
  Katrin Kirchhoff, University of Washington
  Gina Levow, University of Chicago

In summary, things went smoothly. We divided the task into three: Gina dealt with workshop-related matters, Katrin with the second volume, and I dealt with the main volume, CDROM and overall co-ordination. This division of labour worked well.

Our job was considerably eased by Drago's help and the availability of scripts to (semi) automate the task. Additionally, creating a mailing list helped in communicating with 20+ people.
All told, I probably spent a week of my time on publications. Most of this time was spent on creating hardcopy details for the main volume and answering many email queries. I also re-organised the publications software a little, fixed bugs, and wrote detailed HOWTO instructions.

One problem encountered was timing. I should have given all workshop organisers an earlier deadline, which would have allowed more time for me to assemble the CDROM properly. The CDROM contains all information, so it can only be created when everyone is done. This year, although I had most material a few days before the courier was due to pick everything up, I was still receiving material 15 minutes before I was due to burn it! Also, at least one workshop only started the publication process on the very last day. This caused Gina a lot of needless work.

Omnipress are good publishers and proved helpful: they should be used for next year.

(JH: One major problem occurred the week before the conference: Omnipress contacted us, frantic, because they had misnumbered the pages in the companion volume. Everything had already been printed when they discovered that the table of contents, which we had provided, was not followed in the numbering of paper pages. They asked for the TOC and renumbered it to correspond to the numbering of the papers and reprinted those TOC pages, so hopefully this will all be fine. They did respond to this problem (which was entirely their fault) swiftly.)

--Miles Osborne


Tutorial Chairs Report

  Alex Acero, Microsoft Research
  Jamie Callan, CMU
  Andy Kehler, UCSD

We went into this year's tutorial planning with a target of six tutorials, to be spread as evenly as possible over the three areas of Speech Processing, Information Retrieval, and Text Processing, and with a hope of attracting some that tied two or more of these together. We received nine submissions as a result of the call and direct solicitations.
Accepts
-------
1. Finite-state Language Processing
   Shuly Wintner
2. Graphical Models in Speech and Language Research
   Jeff Bilmes
3. Statistical Language Models and Information Retrieval
   ChengXiang Zhai
4. Large-Scale Spoken Document Retrieval
   Pedro J. Moreno and Jean Manuel Van Thong
5. What's New in Statistical Machine Translation
   Kevin Knight and Philipp Koehn
6. Semantic Inference for Question Answering
   Sanda Harabagiu and Srini Narayanan

Three proposals were rejected due to topic overlap considerations or concerns about the size of the potential audience. Proposals 3, 4, and 8 were the result of direct solicitation; the remainder responded to the call.

This left our target number of 6, which achieved a good spread over the three areas of interest, with several achieving coverage of more than one area. In one or two cases, we encouraged presenters to include additional material so that more than one of the three areas would be represented. Proposal 4 originally had a different title and three presenters; we suggested a change in title and a reduction in presenters to one or two.

In their post-mortems, the HLT and NAACL Executive Committees might want to evaluate our approach to selecting tutorials, and give explicit guidance to next year's Tutorial Chairs on how to balance expected draw with other desiderata.

The main difficulty we had this year was getting the requested materials from presenters in accordance with the deadlines specified in the call and in their acceptance letters. Blurbs in ASCII and HTML were due on Feb 15, but additional reminders had to be sent after that date. (There was also a significant delay between the time this material was submitted to the webmaster and when it appeared on the conference website.) The deadline for submitting tutorial slides for reproduction was originally March 17, which none of the presenters met. This deadline was extended to March 31, with several presenters still missing it.
Finally, after an ultimatum was issued for an April 15 deadline, we received the last of them on that date. In retrospect, the original deadline of March 17 was probably too early to have expected presenters to be done with their slides, so in the future we'd recommend a date that is closer to the conference, coupled with timely reminders sent to the presenters.


Workshop Chairs Report

  Richard Sproat, University of Illinois, Urbana-Champaign
  Bhuvana Ramabhadran, IBM T. J. Watson Research
  Alan Smeaton, Dublin City University

1. Submissions

We received 11 workshop proposal submissions, of which we accepted 10. The one rejected proposal was turned down because it was felt to be too sketchy.

2. Accepted Workshops and Projected Attendance

After the paper submission deadline, the following two workshops were merged due to lower than expected submissions:

  - Spoken Language Understanding for Conversational Systems
  - Higher-Level Linguistic and Other Knowledge for Automatic Speech Processing

The combined workshop is listed as WS9 below.

There were three two-day workshops and six one-day workshops.
Projected attendance at each of these workshops, based on registrations as of April 13, 2004, is given in parentheses after each workshop:

  WS1  CoNLL-2004: Eighth Conference on Computational Natural Language Learning
       Thursday and Friday May 6 and 7, 2004 (58)
  WS2  Workshop on Pragmatics of Question Answering
       Thursday and Friday May 6 and 7, 2004 (21)
  WS3  Document Understanding Conference 2004
       Thursday and Friday May 6 and 7, 2004 (41)
  WS4  Workshop on Frontiers in Corpus Annotation
       Thursday May 6, 2004 (20)
  WS5  Workshop on Computational Lexical Semantics
       Thursday May 6, 2004 (30)
  WS6  Second International Workshop on Scalable Natural Language Understanding (ScaNaLU 2004)
       Thursday May 6, 2004 (22)
  WS7  Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval
       Thursday May 6, 2004 (15)
  WS8  Linking Biological Literature, Ontologies and Databases: Tools for Users
       Thursday May 6, 2004 (35)
  WS9  Spoken Language Understanding for Conversational Systems and Higher-Level Linguistic Knowledge for Automatic Speech Processing
       Friday May 7, 2004 (44)


Demo Chairs Report

  Joseph Polifroni, Unveil Technologies
  David Palmer, Virage
  Deb Roy, MIT Media Lab

We received a total of 22 submissions for the demo session. We accepted 19 demos, with one withdrawal. As of the deadline for submission, we had 20 demo proposals. After sending out targeted inquiries to approximately 10 sites, we received two more proposals.

Overall, we think we have a good balance among IR/Summarization type demos, speech/dialogue demos, translation demos and what I'm calling NL demos (i.e., NL-flavored demos that weren't as systems-oriented as the other demos). The one area in which we are weak is speech synthesis, although several of the speech/dialogue demos have synthesis systems embedded within them.
We targeted several synthesis sites and sent email inquiries, but the people involved either didn't respond or couldn't travel to Boston at the time of the conference.

After sending out acceptances, we solicited papers for each system and received 12 papers in response (indicated with (P) below next to the demo system). These papers, which describe the demo systems, will appear in the final proceedings.

For the demo plenary session, which has a 45-minute time slot and is being shared with an awards presentation, we selected 2 demos: "ITSPOKE: An Intelligent Tutoring Spoken Dialogue System," from the University of Pittsburgh, and "A Thai Speech Translation System for Medical Dialogs," from Carnegie Mellon University. In addition to these two demos, the demo co-chairs will present an overview of the demos to be presented in the session, based on input from each presenter.

Below is the full list of demo [acceptances], divided by area.

--IR/Summarization

Alias-I ThreatTrackers
  Breck Baldwin and Bob Carpenter
  breck@alias-i.com

Columbia Newsblaster: Multilingual News Summarization on the Web
  David Kirk Evans, Judith L. Klavans, Kathleen R. McKeown
  devans@cs.columbia.edu (P)

FASIL Email Summarisation System
  Angelo Dalli, Yunqing Xia, Yorick Wilks
  a.dalli@dcs.shef.ac.uk

MiTAP for SARS Detection
  Laurie Damianos, Samuel Bayer, Michael A. Chisholm, John Henderson, Lynette Hirschman, William Morgan, Marc Ubaldino, Guido Zarrella, James M. Wilson, Marat Polyak
  laurie@mitre.org (P)

Multilingual Video and Audio News Alerting
  David D. Palmer, Patrick Bray, Marc Reichman, Katherine Rhodes, Noah White, Andrew Merlino, Francis Kubala
  dpalmer@virage.com (P)

A Scaleable Multi-document Centroid-based Summarizer
  Dragomir Radev, Timothy Allison, Matthew Craig, Stanko Dimitrov, Omer Kareem, Michael Topper, Adam Winkel, Jin Yi
  radev@umich.edu (P)

--Speech/Dialogue

Demonstrations of Perceptive Animated Agents that Teach Children to Read and Learn from Text
  Ronald Cole, Sarel van Vuuren, Bryan Pellom, Kadri Hacioglu, Wayne Ward, Dan Jurafsky, Jiyong Ma, Jie Yan, Justin Post, Nattawut Ngampatipatpong, Jariya Tuantranont, Javier Movellan, Marian Bartlett Stewart
  cole@cslr.colorado.edu

A Framework for Developing Conversational User Interfaces
  Eugene Weinstein, Scott Cyphers, James Glass, Grace Chung
  ecoder@csail.mit.edu

ITSPOKE: An Intelligent Tutoring Spoken Dialogue System
  Diane J. Litman and Scott Silliman
  litman@cs.pitt.edu (P)

Spoken Dialogue for Simulation Control and Conversational Tutoring
  Elizabeth Owen Bratt, Karl Schultz, Brady Clark
  ebratt@csli.stanford.edu (P)

--Translation/Speech

A Thai Speech Translation System for Medical Dialogs
  Tanja Schultz, Dorcas Alexander, Alan W. Black, Kay Peterson, Sinaporn Suebvisai, Alex Waibel
  tanja@cs.cmu.edu (P)

Language Weaver Arabic-to-English Demo
  Alex Fraser, Laurie Gerber, Kevin Knight, Daniel Marcu, Franz Josef Och, William Wong
  lgerber@languageweaver.com

Limited-Domain Speech-to-Speech Translation between English and Pashto
  Kristin Precoda, Horacio Franco, Ascander Dost, Michael Frandsen, John Fry, Andreas Kathol, Colleen Richey, Susanne Riehemann, Dimitra Vergyri, Jing Zheng, Chris Culy
  precoda@speech.sri.com (P)

--OCR

[withdrawn]

--NL

Open Text Semantic Parsing Using FrameNet and WordNet
  Lei Shi and Rada Mihalcea
  rada@cs.unt.edu (P)

SenseClusters - Finding Clusters that Represent Word Senses
  Amruta Purandare and Ted Pedersen
  pura0010@d.umn.edu (P)

Use and Acquisition of Semantic Language Model
  Kuansan Wang, Ye-Yi Wang, Alex Acero
  kuansanw@microsoft.com (P)

A Visually-Oriented, Context-Engaged Language Learning and Communication Aid
  Rupal Patel and Sam Pilato
  R.Patel@neu.edu

WordNet::Similarity - Measuring the Relatedness of Concepts
  Ted Pedersen, Siddharth Patwardhan, Jason Michelizzi
  tpederse@d.umn.edu (P)


Sponsorship and Exhibits Chairs Report

  Roberto Pieraccini, IBM T. J. Watson Research Center
  Douglas Jones, MIT Lincoln Labs

Between October and November 2003 we contacted 48 research and commercial organizations and 8 publishers (shown in the table below), drawn from a list that was initially provided by Deborah Dahl, former sponsorship chair of ACL 2002, and supplemented with new entries.
Ectaco, SRA, Kluwer Academic Publishers, YY Software, Loquendo, IBM, BBN, Motorola, Hughes, Ontology Works, West Group, Springer-Verlag, ELRA, AskJeeves, Microsoft, cogentex, iPhrase, Sun, Lawrence Erlbaum, Intelligent Information Systems, BabelTech, InfoSpace (locus dialog), Boeing, Nuance, Inxight, MIT Press, Xerox PARC, University of Chicago Press, General Electric, Apptek, Microsoft, eMotion, LDC, Teragram, Morgan Kaufman, Daimler-Benz, Lexis-Nexis, Intel (China), AT&T, Philips, SemanticEdge, John Benjamins, XRCE, Neospeech, Google, ARDA, Mitsubishi, Homeland Security, Systran, Transclick, Oxford University Press, Edify, alias-i, Lockheed Martin, Canon, Scansoft

Publishers were asked whether they wanted to have an exhibit table; we offered them the option of sending the material, with no one from the publishing company actually attending the event, and having a student take care of arranging the exhibit and collecting the orders.

The 48 research and commercial organizations were offered the following sponsorship levels:

The benefits offered to the sponsors were as follows:

Additional Gold Sponsor benefits include:

Additional Silver Sponsorship level benefits include:

Additional Bronze level benefits include:

Among the organizations that responded to the request, 10 committed to sponsoring the conference, for a total of $25,000. IBM sponsored the Student Party with its funding. ACM and MIT Press will send publishing material that will be shown at the exhibit.

The general program chair conferred with the NAACL exec / HLT board to extend complimentary exhibit space to bronze sponsors as a one-time offer. MITRE was also encouraged to use free exhibit space as an in-kind return for Christy Doran's time spent on local arrangements.

Publicity Chairs Report

  Shri Narayanan, USC
  Peter Anick, Overture
  Peter Heeman, OHSU

The publicity chairs have two major roles:

The Publicity chairs focused on pre-conference publicity through email lists, hard-copy posters, and endorsements.

Email lists: Email was sent for the Conference Announcement, Call for Tutorial Proposals, Call for Papers, Student Call for Papers, and Workshop Call for Papers. These were sent to the ISCA newsletter (editor Chris Wellekens), the ACL mailing list (via Priscilla Rasmussen), the IEEE Speech Technical Committee Newsletter (editor Richard Rose), the SIGIR-ANNOUNCE mailing list and the IR-LIST.

Posters: We produced hard-copy posters for the conference, modeled after the ASRU 2003 poster. Members of the Organizing Committee were asked to distribute copies of the posters. The posters were mailed out in mid-November to academic and industry laboratories both in the USA and abroad.

Endorsements: Endorsements were received from SIGIR and ISCA.