Monday, January 27, 2020

Algorithm For Segmentation Of Urdu Script English Language Essay

Segmentation of script plays a vital role in script recognition. It is essential to understand the script used in a document before developing or applying a model to recognize it. Previous work relies on techniques such as chain codes. In the ligature model, a word model is used at the document, page and word level for segmentation. Our algorithm for segmentation of Urdu script uses a character model and a Hidden Markov Model (HMM) to enhance previous work. We extract features from images and calculate the maximum likelihood to match characters in the inference algorithm against a feature extracted from a text sample. The main stages of the system are pre-processing, connected component analysis, recognition, and segmentation of text down to the character level. The algorithm provides a means to implement an Urdu OCR system on the basis of the character model.

Keywords
Preprocessing, segmentation of characters, character model, optical character recognition (OCR), max and argmax.

Introduction
We use an OCR system / scanner to get images of text [1]. In preprocessing, the image is converted to a noiseless black-and-white (B/W) image.

1.1 Segmentation
Segmentation is the division of an image into smaller segments or pieces [2]. It occurs on two levels. At the first level, text and graphics are separated for further processing. At the second level, segmentation is performed on the text to separate paragraphs, words, characters and so on. Segmentation of text can be performed at the document, page, paragraph and character levels [3]. Three segmentation approaches have been suggested [4]:

Holistic method
Segmentation-based approach
Segmentation-free approach

In the holistic method, the whole word is classified using a dictionary, and the features of the test input are matched against trained prototypes [5]. Its limitation is that it does not work well for larger classes and can only be used together with the other two methods. The segmentation-based approach divides a word into smaller segments.
The image of the word is broken up into several entities called graphemes [4]. Segmentation depends on human intuition. In the segmentation-free approach, a character model can be used to concatenate characters and form words. For instance, a segmentation-free approach can be based on a Hidden Markov Model (HMM), which is a stochastic model.

1.2 Urdu Language and Text Segmentation
Urdu is written in a cursive script, that is, with the characters joined. Urdu characters are similar in shape and have curves that make them difficult for a machine to recognize. Moreover, a single character can be represented by more than one symbol. Because of this cursive nature, a very accurate technique is needed for a computer program to recognize Urdu characters. Urdu characters have four elementary shapes:

Basic Symbols (38 symbols): Table 1 shows the basic symbols for the Urdu language.
Beginning Symbols (26 symbols): Table 2 shows the beginning symbols for the Urdu language.
Mid Symbols (40 symbols): Table 3 shows the mid symbols for the Urdu language.
Other Symbols: Table 4 includes symbols for numbers and special symbols such as zabar, zair and paish.

Table 1. Basic Symbols
Table 2. Beginning Symbols
Table 3. Mid Symbols
Table 4. Other Symbols

We used the Nastaliq script for our work. We extracted images of the Urdu character set (basic, beginning, mid and other symbols) using an available Nastaliq font.

Literature Review
In a structural approach to script identification, stroke geometry has been used for script characterization and identification [6]. Individual character images in a document are classified either by applying a prototype classification or by using a support vector machine. Ligatures are used for segmentation and recognition of Urdu characters.
A ligature is a sequence of characters in a word separated by non-joiner characters such as a space. The approach in [1] uses a ligature model and is divided into two stages:

Line Segmentation: Line segmentation deals with the detection of text lines in the image. The image is scanned horizontally from right to left, and from top to bottom, in search of a text pixel. It is then determined whether this pixel belongs to a primary ligature or a secondary ligature, as shown in Fig 1. The Freeman chain codes (FCC) of the ligature are compared with the previously calculated FCC of the secondary ligatures.

Character Segmentation: The text is skeletonized and a label matrix is constructed that contains the identifiers of all ligatures in the image. The position of individual characters in a word is determined. Segmentation is done using primary ligatures only.

Fig 1. (a) Urdu word (b) Seven ligatures (c) Three primary ligatures (d) Four secondary ligatures [7].

The method has three limitations. Firstly, segmentation is performed on the basis of primary ligatures only, so it cannot differentiate between seen and sheen, because it ignores secondary ligatures, i.e. dots. Secondly, the dictionary of images stored for training will be huge. Thirdly, there are problems of over-segmentation and under-segmentation.

In [8], a ligature and word model is proposed for Urdu word segmentation. The work was done in three phases. In the first phase, data is collected: ligatures are identified and word probabilities are calculated using a probabilistic measure. From the input set of ligatures, all sequences of words are generated and ranked using lexicon lookup. In the second phase, the top k sequences are selected for further processing using a chosen beam value, with a valid-words heuristic guiding the selection. In the third phase, the most probable sequence among these k word sequences is selected.
Their method used a dictionary of ligatures and words together with chain codes, and used the HMM toolkit HTK to find the most probable sequences when recognizing a word or ligature. They recommended that their work could be further improved by using a character model for Urdu text segmentation [9]. A poor segmentation will lead to poor recognition [10]. In [11], the image is divided into smaller blocks, each block is checked for uniformity, uniform blocks are grouped by color similarity, and text is identified within these blocks. In [12], edge-density-based noise detection is used to segment out text areas in video and images. Segmentation of an image into text and non-text regions affects performance in OCR development [13]. In [14], a line segmentation method using histogram equalization is proposed, various problems are indicated, and text lines are segmented into ligatures using chain codes. In [15], a bounding-box-based approach is presented for segmentation of tables of contents in Urdu script. In [16], horizontal and vertical projection profiles are analyzed for line and character segmentation; misclassification occurs at the character level. In [17], text lines are extracted using vertical projection by marking all points where no pixel values are found, and text lines are segmented into ligatures using stroke geometry. In [18], partial words (i.e. connected components) are identified in a text line, and horizontal and vertical projections are used to identify words through relative distance matching. In [19], a dictionary is used for text line and ligature segmentation in online text.

Problem Statement
Previous work is limited in that it cannot perform segmentation correctly in some cases, which leads to misclassification. Moreover, it can recognize only a limited set of connected components or ligatures.

Proposed Segmentation Algorithm
We enhance previous work by proposing an improved algorithm for Urdu script segmentation that uses a character model. For this purpose we have created a set of characters.
There are about 114 characters, excluding some special characters such as zabar, zair and paish. We have used characters of fixed size and style in this work. We use all the variations of each character in a writing style; for example, bay has three shapes: a basic, a beginning and a mid shape. Our algorithm uses a character model with Hidden Markov Models (HMMs) for segmentation of Urdu text. To the best of our knowledge, this has not been done previously. We work with offline text, i.e. scanned, pre-processed B/W Urdu characters, and we use MATLAB version 7.12 as the programming tool.

4.1 Our Method
Our method is divided into three broad steps:

Step#1 Data Acquisition / Feature Extraction: In the first step, the algorithm transforms images of symbols into binary form as a matrix. It then extracts features from the images using our feature extraction program and stores them on disk. These features are represented as hidden states:

X(i) = { x(0), x(1), ..., x(k) }

where each X(i) represents a feature (in matrix form) for one shape in the Urdu character set, and x(k) is a position vector in the matrix X(i).

Step#2 Get Observed Data: The observed data contain sequences of Urdu characters. In our study we have used a line of Urdu text. After acquiring this filtered image, we transform it into binary form and then extract a feature from it using our feature extraction program. This feature contains several Urdu characters. The algorithm scans it and performs segmentation by calculating maximum probabilities against the hidden states and locating observations in the feature using HMMs. These observations form the observable states:

O(i) = { o(0), o(1), ..., o(k) }

where each O(i) represents a feature (in matrix form) for one shape in the observed states, and o(k) is a position vector in the matrix O(i).

Step#3 Apply HMMs: We are given:

Hidden states: X(i) = { x(1), x(2), ..., x(k) } where i = 1, 2, ..., m (for m characters).
Observable states: O(i) = { o(1), o(2), ..., o(k) } where i = 1, 2, ..., n.
Initial distribution: X(0).

In a hidden Markov model, the state variable x(i) is observable only through its measurements o(i). Now suppose that a sequence O(i) of emissions has been observed. Fig 2 shows the transformation of a character and of an observed sequence, both captured as MATLAB matrices.

Fig 2. (a) An m x n matrix showing the Urdu character alif. (b) Sample observation showing a connected component of the two characters bay and alif, spelling ba.

Instead of using the characters themselves, our algorithm extracts features from all the characters to reduce computational complexity. These features are used as the hidden states x(i) in the HMM and are stored on disk. For example, the features for the characters alif and bay, captured using MATLAB, are shown in Fig 3.

Fig 3. (a) Feature for character alif. (b) Feature for character bay. (c) Feature for sample S(i) taken from the word ba, i.e. bay-alif.

The algorithm extracts a feature from a line of sample text S(i). In the forward algorithm, the feature s(1), ..., s(k) is matched against each of the hidden states x(i) by matching rows of x(i) with rows of S(i). The process continues for all characters and stops after calculating the probabilities for all of them, i.e. P(X(i)|Z(i)). It then maximizes the probability, and in this way finds observation O(1) from S(1). The forward algorithm continues from s(k+1), ..., s(L) to find observations O(2), ..., O(n). If there is more than one probable character and we are not close to the actual results, we can use the Viterbi algorithm, which finds the argmax and gives the optimal probable sequence.
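The Viterbi step mentioned above can be illustrated with a minimal sketch. It is not the authors' MATLAB code: the function name `viterbi`, the two-state toy model and all probability values are assumptions made only for illustration, with state 0 standing for alif and state 1 for bay.

```python
import math


def viterbi(start_p, trans_p, emit_p):
    """Return the most probable state sequence as a list of state indices.

    start_p[s]    : probability of starting in state s (hypothetical values)
    trans_p[r][s] : probability of moving from state r to state s
    emit_p[t][s]  : probability of the t-th observation given state s
    """
    def log(p):
        # Log-probabilities avoid underflow on long sequences.
        return math.log(p) if p > 0 else float("-inf")

    n_states = len(start_p)
    score = [log(start_p[s]) + log(emit_p[0][s]) for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at time t

    for t in range(1, len(emit_p)):
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda r: score[r] + log(trans_p[r][s]))
            pointers.append(best_prev)
            new_score.append(score[best_prev]
                             + log(trans_p[best_prev][s])
                             + log(emit_p[t][s]))
        score = new_score
        back.append(pointers)

    # Trace back from the best final state (the argmax).
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy example: two observation windows, states 0 = alif, 1 = bay.
start = [0.5, 0.5]
trans = [[0.5, 0.5], [0.9, 0.1]]
emit = [[0.2, 0.8], [0.6, 0.4]]
print(viterbi(start, trans, emit))  # [1, 0], i.e. bay followed by alif
```

Unlike a per-window maximum, the traceback chooses the sequence whose joint probability is highest, which is why it helps when several characters are locally plausible.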
The algorithm for the HMMs is as follows:

Algorithm Segsha (S, L)
    j = 1
    while ( j < L )
        for i = 1 to n
            Sample s(j) ~ {w}
            wi = pr(s(j) | X(i))
        end-for
        O(i) = O(i) U {max( wi )}
        j = j + 1
    end-while

Here S is a sample feature of vectors obtained from an observed sequence O(i), i.e. a line of Urdu text; L is the dimension (length) of S; s(j) is a sample taken from S at each iteration and matched against the character feature X(i); and the probability of a match gives the weight wi for each character. max(wi) is the maximization of probability; it can be calculated by comparing wi ~ w using eq. 1 [20].

Result
A total of 1200 words were used, covering all the characters in our character set. Sample scanned text was taken from a Nastaliq font at point size 36. We found that 1176 out of 1200 words were completely recognized. In the remaining words, not the whole word but only one or two characters were misclassified. The accuracy of 97% is very encouraging, and we look forward to working further in this area.

Conclusion
We tested our approach on images of text taken from a Nastaliq font scanned at 300 dpi and found that better results can be achieved by using an HMM with the character model. These results were checked on a prototype using a set of characters. We achieved 97% accuracy.

Future Work and Enhancements
In future we plan to do two things:
1. Eliminate the restriction of fixed font size and style.
2. Work with handwritten Urdu text.
We intend to pursue both using the same method, but that is another story.
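The Segsha loop described in Section 4.1 can be read as a greedy matcher, and the sketch below shows one way to write it. It rests on several assumptions that are not in the essay: features are lists of binary rows, the probability pr(s(j)|X(i)) is replaced by a simple fraction of matching pixels, and the position advances by the length of the matched character feature. The names `match_score` and `segment` and the tiny templates are hypothetical.

```python
def match_score(window, template):
    """Fraction of equal pixels; a stand-in for pr(s(j) | X(i))."""
    total = sum(len(row) for row in template)
    same = sum(a == b
               for w_row, t_row in zip(window, template)
               for a, b in zip(w_row, t_row))
    return same / total


def segment(sample, templates):
    """Greedily label a line feature with character names.

    sample    : list of feature rows (tuples of 0/1) for a line of text
    templates : dict mapping a character name to its list of feature rows
    """
    labels = []
    j = 0
    while j < len(sample):
        best_name, best_score = None, -1.0
        for name, template in templates.items():
            window = sample[j:j + len(template)]
            if len(window) < len(template):
                continue  # template does not fit in the remaining rows
            score = match_score(window, template)
            if score > best_score:  # keep the maximum weight
                best_name, best_score = name, score
        if best_name is None:
            break  # nothing fits; stop rather than loop forever
        labels.append(best_name)
        j += len(templates[best_name])  # continue after the matched rows
    return labels


# Hypothetical two-character set and a sample spelling "ba" (bay-alif).
templates = {
    "alif": [(1, 0), (1, 0)],
    "bay": [(0, 1), (1, 1), (0, 1)],
}
sample = templates["bay"] + templates["alif"]
print(segment(sample, templates))  # ['bay', 'alif']
```

A greedy choice like this can go wrong when a short feature matches the start of a longer one, which is the situation where a sequence-level method such as Viterbi decoding is usually preferred.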

Sunday, January 19, 2020

Assessing the Goal of Sports Products, Inc Essay

Sports Products, Inc. is a large producer of boating equipment and accessories. The two key players in this case are Loren Segura, who works as a clerical assistant in the accounting department, and Dale Johnson, who works in the shipping department. Both were concerned about the company's profits and equally concerned about the stock declining in value; therefore, Loren and Dale try to work out what is important to management and how the current options affect their pay directly (Gitman, 2009).

Solution

a. What should the management of Sports Products, Inc. pursue as its overriding goal? Why?
Sports Products, Inc. should aim to maximize its shareholders' wealth, which should be the most important goal of an organization, although profit is required to increase the company's dividends. The managers must focus on how the organization will continue to profit; shareholders' wealth, however, will increase only while the firm also maintains its reputation for providing excellent boating equipment and accessories to its clientele. The firm will also need to find a way to introduce pollution control for its existing problems and a way to pay the additional cost it will incur. The case study indicates that the firm has never paid any cash dividends in its twenty-year history, and dividends are how stockholders receive their share of the organization's earnings. Shareholders come second when it comes to receiving cash dividends or profit, because a shareholder profits only after everyone else in line, such as the organization's creditors and suppliers, has been paid. This matters here because Sports Products, Inc. is being sued by various officials for dumping waste in adjacent streams. The company has chosen not to invest in pollution control, as this would increase costs and lower the company's profit margin.
Owning the firm places the shareholders at greater risk, and with the firm facing liability for pollution, no one will want to invest in the company; although profits are rising, there is no increase in the firm's stock price.

b. Does the firm appear to have an agency problem? Explain.
There does appear to be an agency problem because, regardless of Dale's and Loren's efforts to do their jobs as cost-effectively as possible and to avoid wasting packaging material, the stock price has still declined by $2 per share over a nine-month period, which is a large decline in less than a year. The company also does not seem to be concerned about introducing a pollution control program, because management is concerned about the cost to themselves and to the company's profit margin.

c. Evaluate the firm's approach to pollution control. Does it seem to be ethical? Why might incurring the expense to control pollution be in the best interests of the firm's owners despite its negative effect on profits?
To be honest, I am unsure how this approach could be considered ethical. Sports Products, Inc. will eventually have to take responsibility at a higher level if the lawsuits go ahead. The organization will then be forced either to introduce a pollution control plan or to pay fines, which will reduce shareholders' wealth even more because, at that point, the shareholders cannot receive anything until the creditors are paid in full.

d. Does the firm appear to have an effective corporate governance structure? Explain any shortcomings.
The governance of Sports Products, Inc. appears poorly structured. The management team is not focused on shareholders' wealth at all. Management wants to maintain company profit in order to break even; however, it is not concerned about dumping waste into streams or about creating a pollution control plan.
The company is not ensuring that its stockholders' wealth is maximized, and if it has not paid cash dividends in 20 years, it is merely trying to stay in business. It is not taking care of the employees who work for it every day, nor does it have the shareholders' best interests at heart.

e. On the basis of the information provided, what specific recommendations would you offer the firm?
Based on the case study, I would recommend that Sports Products, Inc. form a better plan that does not merely break even but also sets out how to introduce a pollution control program that is cost-effective and, if possible, does not affect profits. I would recommend that the firm adopt better ethical values that demonstrate integrity to its constituents and its employees. The organization will need to continue to profit, but it also needs to ensure that the shareholders get a piece of the pie, in addition to changing the standards that have been in place for 20 years.

Saturday, January 11, 2020

Love at First Sight Essay

Once upon a time there was a girl. One day she saw a boy she’d never met across a crowded room. Their eyes locked: she froze in her tracks, her face stuck in awe. Her blood ran cold; her fingers began to tingle as a shiver ran through her entire body. 8.2 seconds later the boy flashed her a beaming smile. His expression injected a flood of warmth into her fragile heart and her mouth involuntarily turned up to return the gesture. She didn’t know how or why but she knew at that moment that this boy was the one. This is the true and universal story of a phenomenon known as love at first sight.

When I was a child I used to shamble after my mom around the house asking her, “Mommy, what’s it like to be in love?” She always sat me down and answered, “It’s nothing I can explain, sweetie, you’ll know it when you feel it.” How could this be? How could an experience be so complex it can’t be described in words? How on earth could this happen with one look? Science says it’s simple: it’s all in our biological makeup. In a recent article published in Psychology Today, John R. Buri, Ph.D. describes that when we experience an “instant attraction,” neurotransmitter chemicals are released into our nervous system, stimulating a powerful “physiological arousal.” But how far does this stimulant take us? We all know what it’s like to encounter a “hot” boy or girl on any regular day, but this exciting meeting is usually easily forgotten and rarely affects us in any way besides providing a topic of conversation among friends (“Have you seen that new cashier? He is fine! And he totally checked me out today”).

Some may say that this brief glitch of pleasure is all that will ever result from a first meeting, but stories all around us attest to something greater. A submission to the PBS segment “American Love Stories” reads, “I met my husband in an emergency room while he was doing a medical school rotation. I was being treated for a migraine headache. From twenty-five feet away and despite numerous interruptions, including my pain, our eyes locked, and we married a little over a year later.” This is just one of the tales that pop up all around us, converting the emotionally willing to hopeless romance. The question we must ask, though, is how much of this phenomenon is rooted in fairytales and how much is rooted in science.

In an experiment recently conducted by Cornell University on a sample of fruit flies, female fruit flies were able to sense, upon first encounter, males of the same species that were genetically capable of producing more offspring with them than other males. The scientists explained this result by concluding that the female flies were innately “wired for love” and “the chemicals and proteins needed for their response [were] already in place, without the need for new genes to be activated.” Though there are differences between the genetics of humans and fruit flies, the same principles may apply. Clara Moskowitz, author of the article “Love at First Sight Might be Genetic,” refers to an experiment where humans were more attracted to the scents of T-shirts worn by people who were not genetically related to them, suggesting that human bodies have a natural instinct that prevents inbreeding and is able to “sense” their better match.

It’s hard to imagine the amazing complexity of the human mind and feelings, but a lot of people put all of their faith or belief into something they can’t even see or understand. In the article “Love at First Sight,” Psychology Today reveals that approximately 60% of Americans believe in love at first sight. This might be due to the fact that over 50% said they have experienced it. Whether or not one “believes” in love at first sight, there’s no question that humans are scientifically capable of it.
Our culture is surrounded by the magical idea of true love and impossibly romantic fairytales that seem too good to be true; but maybe the reason these stories seem so out of reach is that they have an outrageous take on relationships and the circumstances in which they develop. So what is love? A romantic duet in a pond under a star-sprinkled sky? A brave, handsome prince rescuing a gorgeous, innocent damsel in distress from a fire-breathing dragon? A happily ever after? Most would have a hard time defining something as mysterious as love, but with the burst of technology in the last decade, scientists have uncovered explanations for more than was ever thought possible. Judith Newman investigates her heart out in the Parade article “The Science of Love,” breaking down the concept into three chemicals in the brain that each contribute a different piece of the love puzzle. The first, dopamine, is connected to the addictive feeling of pleasure one may feel around someone they love. Norepinephrine, the second neurotransmitter released, causes the jitters and nerves that result from being in love. The third, serotonin, balances out the norepinephrine with a calming effect on the brain. These three transmitters release enough of a “mix of emotions” into the body to cause the sensation we know as love.

As scientists discover more and more about humans, more and more is revealed about how we were biologically constructed to find a life-long partner. And if love really is just a release of fancy brain chemicals, it’s likely that they can work fast enough to be triggered at first sight; we are pretty smart after all. To make the claim that love is all mental is, well, plain mental; yet to say it is scientifically impossible is just as crazy. It’s plain to see that love happens all around us and, most importantly, when we’re not expecting it.
Not everything can be explained by science, even when it comes to biological instinct, but sometimes a simple meeting of the eyes or a flash of a genuine smile explains it all.

Works Cited
1. “Love at First Sight, Blind to the Future.” PBS: Public Broadcasting Service. Web. 15 Feb. 2012.
2. Moskowitz, Clara. “Love at First Sight Might Be Genetic.” Live Science. 08 Apr. 2009. Web. 15 Feb. 2012.
3. Buri, John R., Ph.D. “Love At First Sight.” Psychology Today. 16 Feb. 2010. Web. 15 Feb. 2012.
4. Newman, Judith. “The Science of Love.” Parade 12 Feb. 2012: 9+. Print.

Thursday, January 2, 2020

Push-Pull Factors that Determine Population Migration

In geographical terms, the push-pull factors are those that drive people away from a place and draw people to a new location. A combination of push-pull factors helps determine migration or immigration of particular populations from one land to another. Push factors are often forceful, demanding that a certain person or group of people leave one country for another, or at least giving that person or people strong reasons to want to move, either because of a threat of violence or the loss of financial security. Pull factors, on the other hand, are often the positive aspects of a different country that encourage people to immigrate in order to seek a better life. While it may seem that push and pull factors are diametrically opposed, in fact they both come into play when a population or person is considering migrating to a new location.

Push Factors: Reasons to Leave
Any number of detrimental factors can be considered push factors, which essentially force a population or person from one country to seek refuge in another country. Conditions that drive people to leave their homes can include a sub-standard level of living, food, land or job scarcity, famine or drought, political or religious persecution, pollution, or even natural disasters. Under the worst circumstances, it may be difficult for a person or group to pick and choose a destination: leaving quickly is more important than selecting the best option for relocation. Although not all push factors require a person to leave a country, the conditions that contribute to a person leaving are often so dire that those who do not choose to leave will suffer financially, emotionally or physically. The Great Potato Famine of the mid-19th century, for example, pushed thousands of Irish families to immigrate to the United States to avoid starvation. Populations with refugee status are among the most affected by push factors in a country or region.
Refugee populations are often faced with genocide-like conditions in their country of origin, usually because of authoritarian governments or populations opposed to religious or ethnic groups. For example, Jews leaving Germany during the Nazi era were threatened with violent death if they remained in their home country.

Pull Factors: Reasons to Migrate
Pull factors are those that help a person or population determine whether relocating to a new country would provide a significant benefit. These factors attract populations to a new place largely because of what the country provides that is not available to them in their country of origin. A promise of freedom from religious or political persecution, availability of career opportunities or cheap land, and an abundance of food could be considered pull factors for migrating to a new country. In each of these cases, a population will have more opportunity to pursue a better life compared to its home country. Students entering universities or seeking jobs in more developed countries, for example, might be able to receive larger salaries and greater opportunities than in their countries of origin.

For some individuals and groups, push and pull factors work together. This is particularly the case when push factors are relatively benign. For example, a young adult who cannot find a lucrative job in her home country may consider immigrating only if the opportunities are significantly better elsewhere.

Sources and Further Reading
Baldwin-Edwards, Martin, and Martin A. Schain. The Politics of Immigration in Western Europe. London: Routledge, 1994.
Horevitz, Elizabeth. Understanding the Anthropology of Immigration and Migration. Journal of Human Behavior in the Social Environment 19.6 (2009): 745–58.
Portes, Alejandro, and József Böröcz. Contemporary Immigration: Theoretical Perspectives on Its Determinants and Modes of Incorporation. International Migration Review 23.3 (1989): 606–30.
Zimmermann, Klaus F. European Migration: Push and Pull. International Regional Science Review 19.1–2 (1996): 95–128.