TOWARDS A NEW ETHICS IN BUILDING

Following the “Cognitive Building” concept, building automation systems have been drastically improved within a few years to collect large amounts of user data. However, despite this evolution and the research efforts in the field, human-building interaction remains one of the least mature fields of building science due to the occupants’ complexity and diversity. Collecting data has become simple and cheap, but transforming collected “data” into valuable “information” able to create an effective interaction between buildings and occupants remains complex. This work contributes by proposing a method to translate unstructured data, coming from a Computerized Maintenance Management System (CMMS), into information useful to improve the interactions between occupants and buildings in the management of the maintenance process. End-users’ maintenance requests, collected through a CMMS, were used to create a technical sentiment lexicon able to predict the priority of an intervention based on an inverted naïve Bayes approach. Sentiment lexicons are part of sentiment analysis, a research field introduced to study people’s opinions, sentiments, emotions, and attitudes through Natural Language Processing (NLP). The technical lexicon can be used to immediately assess the priority of simultaneous end-users’ maintenance requests, and is thus faster than traditional Machine Learning methods.


INTRODUCTION
No more than ten years ago, the "cognitive building" concept was introduced, based on the idea that the rapid evolution of ICT and the availability of low-cost IoT devices would quickly transform buildings from passive "containers" of occupants' activities into interactive environments. The concept evolved quickly, and the idea of the "digital twin" emerged to express the possibility of predicting the interaction between occupants and buildings in a digital replica of the real building. Following these concepts, and thanks to the large availability of low-cost sensors [1], building management activities have substantially improved in recent years [2], and Computerized Maintenance Management Systems (CMMSs) have been widely introduced to manage building stocks.
CMMSs can collect end-users' maintenance requests, trace the management process, and collect information.
Sentences without "opinion" words can even imply positive or negative sentiments or opinions of their authors, and many words or sentences may have opposite orientations or polarities in different application domains [7]. Sentiment analysis methodologies have been applied to analyze several aspects of the building management process [13] and to collect information about people's preferences and concerns about energy policies [14].
More recently, Natural Language Processing (NLP) models were also applied to the facility management of buildings, collecting sentiments and opinions from end-users [4] in order to improve building operability and reduce the cost of the management process [15]. Bortolini and Forcada developed a methodology, based on the term frequency (TF) analysis of words expressing the degree of severity, to determine the typical problems that end-users complain about in building systems and their perceived severity [9]. Gunay et al. analyzed operators' work order descriptions in CMMSs, extracting information about failure patterns in building systems and components [16]. The results provided insights into the breakdown of equipment failure events, the top system-level and component-level failure modes, and their occurrence frequencies. Bouabdallaoui et al. proposed a machine-learning algorithm based on NLP to manage day-to-day maintenance activities [17]. Sexton et al. compared NLP methodologies to extract keywords from maintenance work orders [18]. Two emotion lexicon databases, the Hu-Liu database [6] and the NRC emotion lexicon, were used to extract data from semi-structured interviews and focus group discussions regarding housing management in India. Sentiment lexicons were convenient since they were much faster and less computationally intensive than Machine Learning (ML) methods. Indeed, the training and testing process of an ML method normally used to manage this type of data, such as an LSTM or BiLSTM RNN, could require hours or days depending on the size of the dataset and the number of dedicated GPUs or CPUs.
Several general-purpose subjectivity, sentiment, and emotion lexicons have been realized and are publicly available [6,11,12,19,20], but the accuracy of the proposed methodologies and lexicons should be properly evaluated when they are applied to specific domains or used to extract specific aspect-based sentiments.
CMMSs can manage unstructured data [3,4] about the state of the systems and the occupants' perception, typically e-mails written by non-technical end-users to give information about a specific issue. However, due to the use of natural language and the characteristics of the occupants (e.g. level of experience with buildings, systems and components), this information is often imprecise and mixed with personal perceptions about the system's state. Thus, the translation of an end-user's request into a work order (to solve the problem) by technicians is not a simple task, and managing the request often takes more time than solving the problem. Furthermore, the complexity of such a task increases with the number of simultaneous daily requests, which can be particularly high in the case of large organizations hosted in large building stocks.
In other fields, text mining tools were introduced to solve this type of issue, thanks to their ability to discover hidden knowledge from massive and complex data stored in databases or other information repositories [5]. Text mining methods were used to translate occupants' feedback, comments and reviews on social and commercial platforms (data) into "information" useful to improve products and services [6].
Among the text mining methods, sentiment analysis (SA) has recently received particular attention [7]. SA studies people's opinions, sentiments, emotions, and attitudes, and is often employed to extract opinion polarity and degree from different sources [8]. SA was also introduced in facility management to collect information about the status of building systems directly from end-users' perceptions [9] and to improve preventive maintenance strategies [10].
The most important indicators of sentiments are the "opinion" words [6], but phrases and idioms can also express sentiments [7]. A list of such words and phrases is called a sentiment "lexicon". Several general lexicons have been realized and are available, e.g. the General Inquirer lexicon, the Hu-Liu lexicon [6], the MPQA subjectivity lexicon [11], SentiWordNet [12], and also emotion lexicons. Over the years, researchers have designed numerous algorithms to compile and improve such lexicons, considering that "opinion" words may have opposite orientations or polarities in different application domains or sentence contexts.

RESEARCH FRAMEWORK
A mathematical approach based on the naïve Bayes method is proposed to develop the technical lexicon. In particular, the method has been inversely applied, deriving the polarity of technical words when used together from the priority scores assigned to a subset of a corpus of end-users' maintenance requests, manually annotated by technicians with different expertise. The corpus comprises more than 12,000 end-users' maintenance requests. These requests were generated over 34 months by the personnel employed at Università Politecnica delle Marche (Ancona, Italy) and collected in the CMMS managed by the facility management general contractor (ANTAS spa).
The Human Manual Annotation (HMA) process is based on a BWS (best-worst-scale) approach. Twenty annotators with three different expertise levels (high, medium, low), depending on their skills and work experience in the construction field, were presented with several 4-tuples of requests and asked to select the most positive one and the most negative one. Ten high-level, four medium-level, and six low-level annotators participated in the HMA task. A random subset of sentences has been extracted from the dataset, respecting the proportion of sentences by category type. 150 distinct 4-tuples were randomly generated through the "bwstuples" Python script (http://valeriobasile.github.io/) so that each term was seen in five different 4-tuples. The score is given by the ratio between the number of times an item is chosen as "best/worst" and the number of times it appears [25]. The concordance among annotators has been checked through a correlation analysis based on the Spearman method. To be robust with respect to possible outliers, the work of each annotator with a Spearman's rho below 0.8 (computed against the mean value of the group of annotators) has been discarded, and Krippendorff's alpha coefficient was then calculated.
Krippendorff's alpha (K-alpha) is a coefficient measuring the degree of concordance among annotators, and it is used in several fields.
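As an illustration of the BWS scoring step, the following Python sketch computes one score per item from a list of best/worst judgements. It assumes the common BWS counting rule, (times chosen as best minus times chosen as worst) divided by the number of appearances, which yields values in [-1, 1]; the study describes its ratio only in words, so this rule is an assumption. The function name, item names and judgements are hypothetical.

```python
from collections import Counter


def bws_scores(judgements):
    """Score items from BWS judgements.

    judgements: list of (items_in_tuple, best_item, worst_item).
    Assumed rule: (best count - worst count) / appearances.
    """
    appearances, best, worst = Counter(), Counter(), Counter()
    for items, best_item, worst_item in judgements:
        appearances.update(items)
        best[best_item] += 1
        worst[worst_item] += 1
    return {
        item: (best[item] - worst[item]) / appearances[item]
        for item in appearances
    }


# Hypothetical judgements by one annotator over five requests (s1..s5).
judgements = [
    (("s1", "s2", "s3", "s4"), "s1", "s4"),
    (("s1", "s2", "s3", "s5"), "s1", "s3"),
    (("s2", "s3", "s4", "s5"), "s5", "s4"),
]
scores = bws_scores(judgements)
```

How "best" and "worst" map onto low and high priority is a convention of the annotation protocol and is not fixed by this sketch.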
Several studies have been performed to check the concordance of different lexicons in different domains [21]. Various combinations of existing lexicons and NLP tools have been evaluated against a human-annotated subsample, which serves as a gold standard. Cambria et al. described several comparative works based on human annotation approaches (Best-Worst, MaxDiff) [7].
However, despite a significant amount of research, challenging problems remain. A general, quick-to-use and effective method for discovering and determining domain- and context-dependent sentiments is still lacking [22].
D'Orazio et al. [23] demonstrated that SA is a powerful method to extract information from unstructured data, but also that general and easy-to-apply word-based lexicons commonly used in other contexts cannot be simply applied to the field of building facility management, because "technical" words are not recognized as "opinion" words in other fields. An example is represented by the words "falling" and "ceiling". These words express a serious problem for a technician when they jointly occur in a maintenance request, but this connection does not seem to be properly recognized by general lexicons, even if the two words are in the same polarized cluster [23]. It is therefore necessary to develop technical lexicons to apply these methods correctly.
Given the above, this work proposes a method to translate unstructured data, coming from CMMSs, into information useful to improve the interactions between occupants and buildings in the management of the maintenance process. End-users' (occupants') maintenance requests collected through CMMSs are used to create a technical sentiment lexicon able to predict their priority, thus being sensitive to the application context of building maintenance. In particular, the method adopts an inverted naïve Bayes approach [24], which can be powerful considering (1) the possibility of combining the priority impact of each "technical" word in the lexicon, as well as of common words, and (2) its quick application to real-world data in real-world conditions, also with respect to other common ML methods. By automating the priority assignment process, this method will reduce the time necessary to solve maintenance issues and improve the end-users' satisfaction with the functionality of the systems (HVAC, electricity, water, etc.), with impacts on building operation and comfort issues (e.g. indoor temperature, IAQ) as well as on the safety of the users (e.g. referring to safety systems, elevators, etc.).
In Bayes' theorem, P(b|a) is the conditional probability of observing an event b given that a is true, P(a) is the unconditional a priori probability of a without any regard to b, and P(b) is the unconditional a priori probability of b without any regard to a. Knowing the unconditional probabilities of a and b (observed frequencies) and the conditional probability of b given that a is true, it is possible to calculate the conditional probability of a given that b is true. To clarify the concept and show its application to NLP, a simple example, adapted from [24], is explained considering a binary system (only two possible values for each sentence).
Suppose we have a dataset of 7 annotated sentences regarding the internal climate, each with an attributed polarity score: ⨁ (positive) or ⊖ (negative).
By calculating the word frequencies, we can predict the probability P that a sentence containing the words (it, is, not, nice, cold) will be positively or negatively annotated, using equations 2-7 to obtain PNB+, PNB-, and CNB, as described in equations 8-11.
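The binary example can be sketched in Python as follows. This is only an illustration of a standard naïve Bayes text classifier. The seven training sentences are invented for the sketch (the original example sentences are not reproduced in this text), and add-one (Laplace) smoothing is an assumption, since the exact form of equations 2-7 is not shown here.

```python
import math
from collections import Counter

# Hypothetical annotated sentences about the internal climate.
train = [
    ("it is nice and warm", "+"),
    ("the room is nice", "+"),
    ("nice temperature today", "+"),
    ("it is cold", "-"),
    ("it is not nice", "-"),
    ("the office is too cold", "-"),
    ("not warm at all", "-"),
]


def fit(data):
    """Count word and class frequencies in the annotated sentences."""
    word_counts = {"+": Counter(), "-": Counter()}
    class_counts = Counter()
    for text, label in data:
        class_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set(word_counts["+"]) | set(word_counts["-"])
    return word_counts, class_counts, vocab


def log_posterior(words, label, model):
    """Unnormalized log P(label | words) with add-one smoothing (assumed)."""
    word_counts, class_counts, vocab = model
    total = sum(word_counts[label].values())
    logp = math.log(class_counts[label] / sum(class_counts.values()))
    for w in words:
        logp += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
    return logp


def classify(words, model):
    """Return the class with the larger posterior."""
    return max(("+", "-"), key=lambda c: log_posterior(words, c, model))


model = fit(train)
query = ["it", "is", "not", "nice", "cold"]
predicted = classify(query, model)
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied.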
For each of the 150 sentences, the final score is the mean of the scores given by the remaining annotators. The number of sentences used in the process has been chosen to ensure that more than 90% of the words employed in the whole dataset are part of the subsample.
Thanks to the availability of annotated sentences, a mathematical reformulation of the naïve Bayes approach has been performed to extract the specific technical polarity content of each word. A method to calculate the technical-specific content has been proposed in Section 3 by assuming that: (1) the scores attributed by the technicians to each specific sentence are due to the words contained in the sentence, and that (2) the technical words are also opinion words.
Finally, the method has been applied to the extracted dataset to derive the proposed technical lexicon. The R language (release 4.3) has been used to perform the operations necessary to derive the lexicons by applying the mathematical formulation proposed in the following section.

THE NAÏVE BAYES APPROACH AND THE RELATED CRITERIA FOR EFFECTIVENESS ANALYSIS
Naïve Bayes is a well-known ML classification method based on Bayes' theorem. It has been extensively applied to classify data and text and to predict a particular state of a system based on conditional probability calculation. In particular, from the number of times an event has occurred, the naïve Bayes method derives the probability that it may occur again in future trials or observations [24]. The probabilities of the events are computed from their observed frequencies (a posteriori probabilities).
In particular, Bayes' theorem states that, given two events a and b:

$$P(a \mid b) = \frac{P(b \mid a)\,P(a)}{P(b)}$$

where P(a|b) is the conditional probability of observing an event a given that b is true.
Starting from the assumption that each word contained in a specific sentence can contribute to the score attributed to that sentence, it is possible to apply the method described in Section 2.2, inverting equations 6 and 7.
In particular, if we consider a dataset of sentences containing a set of words (A, B, …, N), equations 6 and 7 can be rewritten, for each sentence, to obtain equations 12 and 13.
The naïve Bayes approach can also be used to predict the class of a specific sentence for n-class systems, where n → number of annotated sentences, when each sentence has been "annotated" with a different specific value instead of a category.

THE INVERTED NAÏVE BAYES METHOD FOR END-USERS' MAINTENANCE REQUESTS ANALYSIS
The naïve Bayes approach described in Section 2.2 can be inverted to obtain, from an "annotated" dataset, the polarity content of each word.
Having a corpus where each sentence is annotated with a specific score, we can simplify equations 12 and 13, where $X_{1,1}, \ldots, X_{N,N}$ are the unknown coefficients expressing, for each sentence, the contribution of each word to the score attributed to the sentence. If we now define $K_{N_{1,2,\ldots,N}}$ and $Z_{N_{1,2,\ldots,N}}$ (equations 16 and 17), the whole set of linear equations takes the form of equation 18. It is not possible to solve the linear equation set directly, because it is impossible to find two sentences composed of the same words, but an approximate solution can be found. Assuming that each word contributes equally to the score of each specific sentence (equation 19), we can write equations 20 and 21 for sentence 1. Repeating the same process for each annotated sentence, we obtain a vector Q expressing the contribution of each word contained in each specific sentence to the total score attributed to the sentence.
Twenty annotators attributed a priority score to the sentences through the BWS approach. The score ranges from -1 to 1 and represents the intervention priority to assign to each end-user's request with respect to the other maintenance requests. A correlation analysis (Fig. 1), based on the Spearman method (due to the non-normality of the sample), has been performed to check the concordance between the annotators. For the same sentences, the scores attributed by the different annotators were correlated with each other and with the mean value. Observing the values of the last column (Spearman's rho coefficient of each annotator with respect to the mean value), three annotators appear to diverge strongly from the others (two negative correlations and one result that could not be evaluated). The other annotators attributed similar scores, with correlation coefficients in the range 0.71-0.9.
In order to reduce discordances and obtain a unique score for each sentence, the whole work of the annotators diverging from the mean of the group (rho < 0.8) was discarded. The final Krippendorff's alpha coefficient was then calculated, obtaining a value of 0.714 (> 0.67), thus expressing an adequate concordance level.
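The annotator screening described above can be sketched in Python as follows. The sketch is written with the standard library only, so Spearman's rho is computed as the Pearson correlation of average ranks. It assumes that the group mean is taken over all annotators, including the one being tested; the text does not specify this detail. The annotator names and scores are hypothetical.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation (undefined if either input is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def filter_annotators(scores, threshold=0.8):
    """Keep annotators whose rho against the group mean reaches the threshold."""
    n_items = len(next(iter(scores.values())))
    group_mean = [
        sum(s[i] for s in scores.values()) / len(scores) for i in range(n_items)
    ]
    kept = {
        name: s for name, s in scores.items()
        if spearman(s, group_mean) >= threshold
    }
    final = [sum(s[i] for s in kept.values()) / len(kept) for i in range(n_items)]
    return kept, final


# Hypothetical BWS scores of three annotators for five sentences.
scores = {
    "a1": [-1.0, -0.5, 0.0, 0.5, 1.0],
    "a2": [-0.8, -0.6, 0.1, 0.4, 0.9],
    "a3": [1.0, 0.5, 0.0, -0.5, -1.0],  # reversed ordering
}
kept, final = filter_annotators(scores)
```

Here annotator "a3" is negatively correlated with the group mean and is discarded, and the final score of each sentence is the mean over the remaining annotators.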
Considering that the BWS process used to annotate the sentences produces scores ranging from -1 to 1, it is necessary to correct the Q equation by extracting the polarity (the sign of the score) to avoid errors, as described by equation 23, where polarity_1 = -1 or 1 (depending on the polarity of sentence 1).
Since the same word yields a different Q value in each sentence that contains it, the approximate solution for the unknown coefficients (equation 18) is obtained by summing the Q values obtained for each word over all sentences and dividing the result by the number of sentences containing the word (equation 24). The approximate solution can be further refined with an iterative process until convergence.
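The averaging idea described above can be illustrated with a deliberately simplified Python sketch. It is not a reproduction of equations 19-24, which are not shown in this text and may include frequency or probability terms. It only assumes that (1) the magnitude of a sentence score is split equally among the distinct words of the sentence, (2) the sign (polarity) of the score is reapplied afterwards, and (3) the value of a word is the mean of its per-sentence contributions. The data are the four sample tap-leak sentences of Tab. 1; the function name is hypothetical.

```python
from collections import defaultdict

# Reduced sentences and scores from Tab. 1 (repeated words counted once).
sentences = [
    (["tap", "leaks", "water"], -0.4),
    (["tap", "leaks", "lot", "water"], -0.6),
    (["tap", "leaks", "river", "water"], -1.0),
    (["tap", "little", "leaks", "amount", "water"], -0.1),
]


def word_contributions(data):
    """Average equal-share contribution of each word (simplifying assumption)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for words, score in data:
        polarity = -1.0 if score < 0 else 1.0
        unique = set(words)
        share = abs(score) / len(unique)  # equal contribution of each word
        for w in unique:
            sums[w] += polarity * share
            counts[w] += 1
    # Mean over the sentences that contain the word.
    return {w: sums[w] / counts[w] for w in sums}


lexicon = word_contributions(sentences)
```

Under these assumptions, words shared by all sentences ("tap", "leaks", "water") receive the same value, while "river", "lot" and "little" are separated only by the scores of the single sentences in which they occur.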

THE MANUAL ANNOTATION PROCESS ON THE CASE STUDY
In order to apply the proposed method, a manual annotation process based on the BWS (best-worst-scale) approach has been performed to attribute a score to each sentence within a sample of 150 end-users' maintenance requests.

No.  Original sentence                       Reduced sentence                       Score
1    The tap leaks water                     tap leaks water                        -0.4
2    The tap leaks a lot of water            tap leaks lot water                    -0.6
3    The tap leaks a river of water          tap leaks river water                  -1
4    The tap leaks a little amount of water  tap little leaks little amount water   -0.1

Tab. 1. Original and reduced sentences, with the attributed scores. Reduced sentences were obtained with a stopword removal process.
Then the DTM (document-term matrix) was built. A DTM is a matrix expressing each word's presence or absence in each sentence (Tab. 2). Finally, equations 20-24 were applied, obtaining frequencies and Q values for each of the words contained in the dataset. The small differences found between "little", "lot", and "river" can be explained by the very limited size of the sample.
Then, the mean of the scores attributed by the remaining annotators was calculated for each sentence. Finally, the 150 sentences were ordered on a scale from -1 (high priority) to 1 (low priority). Figure 2 shows the distribution of the scores obtained through the manual annotation process based on the BWS approach. As expected, the scores are normally distributed.
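The document-term matrix step mentioned in this section can be illustrated with a short Python sketch. The case study was carried out in R, so this is only an equivalent illustration, not the script used by the authors. It builds a presence/absence matrix from the reduced sentences of Tab. 1; the function name is hypothetical.

```python
def document_term_matrix(docs):
    """Return (vocabulary, matrix); matrix[i][j] is 1 if term j occurs in doc i."""
    vocab = sorted({word for doc in docs for word in doc})
    matrix = [[1 if term in doc else 0 for term in vocab] for doc in docs]
    return vocab, matrix


# Reduced sentences from Tab. 1.
docs = [
    ["tap", "leaks", "water"],
    ["tap", "leaks", "lot", "water"],
    ["tap", "leaks", "river", "water"],
    ["tap", "little", "leaks", "little", "amount", "water"],
]
vocab, dtm = document_term_matrix(docs)

# Number of sentences containing each term (column sums of the DTM).
doc_frequency = {term: sum(row[j] for row in dtm) for j, term in enumerate(vocab)}
```

Because the matrix records presence or absence only, the repeated word "little" in the fourth sentence is counted once.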

APPLICATION OF THE PROPOSED METHOD TO THE CASE STUDY
To show the application of the proposed method, an R script has been written, and the Q values were calculated for the words contained in a limited set of sentences, starting from the dataset of sentences previously analyzed and applying the equations described in Section 3. Four sentences of the dataset were adapted to have a limited set of words for which to calculate frequencies and Q values. Table 1 shows the sample sentence set and the scores attributed to each sentence. After a preliminary removal of the stopwords (common words not expressing specific opinions), symbols and punctuation, performed using the tm package, the phrases were reduced to a limited set of words repeated in the sentences.
The collected textual data (end-users' requests) could be simply pre-processed to calculate the priority score thanks to the proposed technical sentiment lexicon. Once the automatic assignment process has been performed, the collected data can be screened by technicians, thus (1) supporting the quick identification of the most urgent needs that can cause interruptions of services for users, as well as safety and discomfort issues, and (2) allowing technicians to focus on the next management tasks. Such an aid could be fundamental, especially for facility management in large and complex organizations. Technicians could then focus on the deployment of corrective actions, e.g. by identifying which staff members' expertise is needed to solve the fault and assigning the work accordingly. In this sense, future works could also employ ML methods to support other maintenance tasks, e.g. work category assignment, thus further reducing manual effort by technicians and speeding up fault resolution.
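The preprocessing and the lexicon-based priority scoring described in this section can be sketched together in Python. Everything specific in the sketch is an assumption made for illustration: the stopword list is a tiny hypothetical one (the case study used the R tm package), the lexicon values are invented, and aggregating the word scores by their mean is only one possible way of combining the priority impact of the words.

```python
import string

# Tiny hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "is", "in", "and"}

# Invented lexicon values (-1 = high priority, 1 = low priority).
LEXICON = {"leaks": -0.2, "river": -0.8, "little": 0.3,
           "falling": -0.9, "ceiling": -0.5}


def reduce_sentence(text, stopwords=STOPWORDS):
    """Lowercase, strip punctuation and drop stopwords."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [w for w in cleaned.split() if w not in stopwords]


def priority_score(text, lexicon=LEXICON):
    """Mean lexicon value of the known words (assumed aggregation rule)."""
    hits = [lexicon[w] for w in reduce_sentence(text) if w in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


requests = [
    "The tap leaks a little.",
    "The ceiling is falling!",
    "The tap leaks a river of water!",
]
# Most urgent first: lower scores mean higher priority.
ranked = sorted(requests, key=priority_score)
```

Since the scoring is a dictionary lookup per word, a batch of simultaneous requests can be ranked without any model training, which is the speed advantage claimed for lexicon-based approaches.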
The sample chosen to show the application of the proposed method is of limited size. Only with a very large number of sentences is it possible to calculate frequencies that do not depend on the specific dataset and to reach enough accuracy to build a general technical lexicon.

CONCLUSION
This work proposes a method to use polarity scores attributed to a set of annotated end-user maintenance requests to extract the "specific polarity content" of each word contained in the dataset. The technical lexicon can be used to predict the priority to assign to simultaneous end-users' maintenance requests more quickly than traditional ML methods.
The work is based on the end-users' maintenance requests collected in a CMMS by a facility management general contractor. From a dataset containing more than 12,000 end-users' maintenance requests, 150 sentences were extracted, and a BWS manual annotation approach was applied. The manual annotation was performed by 20 annotators, obtaining priority scores for each sentence ranging from -1 (high priority) to 1 (low priority). Then a mathematical reformulation of the naïve Bayes classification approach has been proposed to extract from the annotated sentences the priority expressed by each word. Compared with other common ML methods, the proposed approach, based on SA methods, can give reliable results with very limited use of computational resources.
The proposed approach has been applied to a limited sample dataset to show its validity. Following the proposed method, future works will transform data collected through CMMSs to create a complete technical sentiment lexicon able to predict the priority expressed by occupants through unstructured end-users' maintenance requests. This effort will make it possible to apply the lexicon and the proposed methodology to building maintenance and management tasks in complex built environments.
Real-world applications could benefit from the outcomes of this work. Relational databases used in commercial CMMSs could be easily integrated with the proposed methodology.